Advanced Usage Examples
This section contains examples of how to accomplish common tasks and workflows with the Scrapingpass API.
JS Rendering
Sometimes a headless browser (i.e. JS rendering) is required to scrape client-side rendered web applications. Scrapingpass lets you retrieve the rendered HTML, a screenshot, a PDF, and more through JS rendering.
HTML through JS Rendering
HTTP REST API
- cURL
- Python
curl 'https://api.scrapingpass.com?api_key=YOUR-API-KEY&url=https://example.com/&js_rendering=true&html=true'
import requests
url = "https://api.scrapingpass.com"
options = {
"api_key": "YOUR-API-KEY",
"url": "https://example.com/",
"html": True,
"js_rendering": True,
}
response = requests.get(url, params=options)
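The response body contains the rendered HTML, so you can save or parse it directly. A minimal continuation of the snippet above (the file name is illustrative):
response.raise_for_status()  # fail fast on a non-2xx API response
with open("rendered.html", "w", encoding="utf-8") as f:
    f.write(response.text)  # the fully rendered HTML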
HTTP Proxy Mode
- cURL
- Python
curl --insecure \
--proxy 'http://YOUR-API-KEY:scrapingpass&js_rendering=true&html=true@proxy.scrapingpass.com:8080' \
'https://example.com/'
import requests
# verify=False mirrors cURL's --insecure: proxy mode presents its own
# TLS certificate for HTTPS requests, so verification must be disabled.
proxy_url = "http://YOUR-API-KEY:scrapingpass&js_rendering=true&html=true@proxy.scrapingpass.com:8080"
proxies = {"http": proxy_url, "https": proxy_url}
response = requests.get("https://example.com/", proxies=proxies, verify=False)
PDF through JS Rendering
HTTP REST API
- cURL
- Python
curl 'https://api.scrapingpass.com?api_key=YOUR-API-KEY&url=https://example.com/&pdf=true'
import requests
url = "https://api.scrapingpass.com"
options = {
"api_key": "YOUR-API-KEY",
"url": "https://example.com/",
"pdf": True,
}
response = requests.get(url, params=options)
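Assuming the API returns the PDF bytes directly in the response body, you can write response.content to disk. A minimal continuation of the snippet above:
response.raise_for_status()
with open("page.pdf", "wb") as f:  # binary mode, since the body is a PDF
    f.write(response.content)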
HTTP Proxy Mode
- cURL
- Python
curl --insecure \
--proxy 'http://YOUR-API-KEY:scrapingpass&pdf=true@proxy.scrapingpass.com:8080' \
'https://example.com/'
import requests
proxy_url = "http://YOUR-API-KEY:scrapingpass&[email protected]:8080"
proxies = {"http": proxy_url, "https": proxy_url}
response = requests.get("https://example.com/", proxies=proxies, verify=False)
Forwarding Headers
Scrapingpass allows you to forward any desired set of headers to the remote. To forward headers, specify forward_headers=true, and Scrapingpass will intelligently guess which headers to forward while dropping those meant for the Scrapingpass API itself. Note, however, that this approach can be slightly error-prone. If you run into any issues, use the more rigorous approach described below.
To ensure that exactly the right headers reach the remote, you can instead prefix each header you wish to forward with Sp-; Scrapingpass will strip the Sp- prefix and forward the remainder. For example, to forward the header Referer: https://example.com/, send Sp-Referer: https://example.com/ with your request to the Scrapingpass API, and the correct Referer header will be forwarded to the remote. Note that headers prefixed with Sp- are always forwarded, regardless of the forward_headers setting.
tip
If you're using HTTP Proxy Mode, you don't need to do anything special. You can specify headers the same way you would if you were sending the request directly to the remote yourself.
HTTP REST API
- cURL
- Python
# When using the forward_headers=true option.
curl 'https://api.scrapingpass.com?api_key=YOUR-API-KEY&url=https://httpbin.org/headers&forward_headers=true' \
--header 'Your-Header-Key: YourHeaderValue'
# When using the Sp- prefix for headers to forward.
curl 'https://api.scrapingpass.com?api_key=YOUR-API-KEY&url=https://httpbin.org/headers' \
--header 'Sp-Your-Header-Key: YourHeaderValue'
import requests
url = "https://api.scrapingpass.com"
# When using the forward_headers=true option.
options = {
"api_key": "YOUR-API-KEY",
"url": "https://httpbin.org/headers",
"forward_headers": True,
}
headers = {
'Your-Header-Key': 'YourHeaderValue'
}
# When using the Sp- prefix for headers to forward.
# options = {
# "api_key": "YOUR-API-KEY",
# "url": "https://httpbin.org/headers",
# }
# headers = {
# 'Sp-Your-Header-Key': 'YourHeaderValue'
# }
response = requests.get(url, params=options, headers=headers)
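Because the examples target https://httpbin.org/headers, which responds with a JSON document listing the headers it received, you can check that forwarding worked by inspecting the response body:
received = response.json()["headers"]  # headers as seen by the remote
print(received.get("Your-Header-Key"))  # expected: "YourHeaderValue"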
HTTP Proxy Mode
- cURL
- Python
curl --insecure \
--proxy 'http://YOUR-API-KEY:scrapingpass&@proxy.scrapingpass.com:8080' \
--header 'Your-Header-Key: YourHeaderValue' \
'https://httpbin.org/headers'
import requests
proxy_url = "http://YOUR-API-KEY:scrapingpass&[email protected]:8080"
proxies = {"http": proxy_url, "https": proxy_url}
headers = {
'Your-Header-Key': 'YourHeaderValue'
}
response = requests.get("https://httpbin.org/headers", headers=headers, proxies=proxies, verify=False)
Making POST/PUT Requests
Sometimes you need to capture the state after a form submission, which usually involves making a POST/PUT request to the remote along with some form data. Scrapingpass lets you achieve this easily.
Simply make the request to the API with the POST/PUT method and include your request body or form data. Scrapingpass will transparently forward the same body to the remote when making the POST/PUT request.
HTTP REST API
- cURL
- Python
curl --request POST \
--form 'your_form_data_field="YourFormFieldValue"' \
'https://api.scrapingpass.com?api_key=YOUR-API-KEY&url=https://httpbin.org/anything'
import requests
url = "https://api.scrapingpass.com"
options = {
"api_key": "YOUR-API-KEY",
"url": "https://httpbin.org/anything"
}
payload = {'your_form_data_field': 'YourFormFieldValue'}
response = requests.post(url, params=options, data=payload)
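The same applies to PUT requests. A minimal variant of the snippet above, changing only the HTTP method:
# Reusing url, options, and payload from above; only the method changes.
response = requests.put(url, params=options, data=payload)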
HTTP Proxy Mode
- cURL
- Python
curl --insecure \
--request POST \
--proxy 'http://YOUR-API-KEY:scrapingpass&@proxy.scrapingpass.com:8080' \
--form 'your_form_data_field="YourFormFieldValue"' \
'https://httpbin.org/anything'
import requests
proxy_url = "http://YOUR-API-KEY:scrapingpass&@proxy.scrapingpass.com:8080"
proxies = {"http": proxy_url, "https": proxy_url}
payload = {'your_form_data_field': 'YourFormFieldValue'}
response = requests.post("https://httpbin.org/anything", data=payload, proxies=proxies, verify=False)