  • Administrators
Posted

This one has been getting to me for a while now. I am trying to log in to a site, go to its reports page after a successful login, and then download up to 3 reports that are linked on that page.

 

 

The problem that I'm having is that the reports are randomly named, and I can't get curl to download them.

 

I'll post the HTML file that the site sent back to me. Can someone see how to download the 3 files on that page without hardcoding the file names?

 

The 3 reports are named:

 

My Company Invoice Upload_516_146_2012391728360

 

My Company Invoice Upload_516_146_20123917221402

 

My Company Patient Upload_516_146_2012391723258

 

My Scheduled Reports.htm

Posted

If you are scraping these out with regex:

/Home/frmMyReportOpen.aspx?FileName=018a257a-4fa1-42e3-af0d-f818e4dfda3c.csv&FileExtension=CSV&FilePath=192.168.9.95BTReports

and you are trying to get the csv file with a new cURL call, you have to turn each "&amp;" in the scraped link into a plain "&" with html_entity_decode

http://php.net/manual/en/function.html-entity-decode.php

 

You also have to prepend the base URL to that link, since it is a relative path.
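
The two steps above (decode the entities, then prepend the base URL) can be sketched roughly like this. It is only a minimal sketch: it assumes the report links appear as href attributes pointing at frmMyReportOpen.aspx, and the function name extractReportUrls, the sample markup, and the https://example.com base URL are made up for illustration, not taken from the actual site.

```php
<?php
// Hypothetical helper (not from the original thread): collect every report
// link from the reports page HTML and return absolute, decoded URLs.
function extractReportUrls(string $html, string $baseUrl): array
{
    // Assumption: the links are href attributes that point at
    // frmMyReportOpen.aspx, so no file name needs to be hardcoded.
    $pattern = '/href=["\']([^"\']*frmMyReportOpen\.aspx\?[^"\']*)["\']/i';
    preg_match_all($pattern, $html, $matches);

    $urls = [];
    foreach ($matches[1] as $href) {
        // In the HTML source the separators are written as "&amp;";
        // html_entity_decode turns them back into plain "&".
        $href = html_entity_decode($href, ENT_QUOTES, 'UTF-8');

        // The href is a relative path, so prepend the site's base URL.
        $urls[] = rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
    }

    // Drop duplicates in case the same report is linked more than once.
    return array_values(array_unique($urls));
}

// Made-up sample markup and base URL, for illustration only.
$html = '<a href="/Home/frmMyReportOpen.aspx?FileName=abc.csv&amp;FileExtension=CSV">Report</a>';
$urls = extractReportUrls($html, 'https://example.com');
print_r($urls);
```

Each URL that comes back could then be passed to a new cURL request. That request would most likely need to reuse the same cookie file as the login request so the session is still valid.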

  • Administrators
Posted

It's probably going to take me a minute to digest what you are saying, but thank you, and I will see if I can figure it out. Do you know of a tutorial that explains how to scrape with regex? I'm going to google it, but if you have a good one in mind I'd like to hear it.

 

Thanks!
