nyawc.http package

Submodules

nyawc.http.Handler module

class nyawc.http.Handler.Handler(options, queue_item)[source]

Bases: object

The Handler class executes HTTP requests.

__options[source]

obj – The settings/options object.

__queue_item[source]

obj – The queue item containing a request to execute.

_Handler__content_type_matches(content_type, available_content_types)[source]

Check if the given content type matches one of the available content types.

Parameters:
  • content_type (str) – The given content type.
  • available_content_types (list) – All the available content types.
Returns:

True if a match was found, False otherwise.

Return type:

bool
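The matching rule itself is not spelled out above. As a minimal sketch, assuming the check ignores header parameters such as `; charset=utf-8` and compares case-insensitively, it could look like the following. The function name `content_type_matches` and the normalisation steps are illustrative assumptions, not the library's actual implementation.

```python
def content_type_matches(content_type, available_content_types):
    """Return True if content_type matches one of the available types.

    Assumption: parameters after ';' (e.g. charset) are ignored and the
    comparison is case-insensitive. The real Handler may differ.
    """
    if not content_type:
        # A missing Content-Type header can never match.
        return False

    # "text/html; charset=utf-8" -> "text/html"
    media_type = content_type.split(";", 1)[0].strip().lower()

    return media_type in {t.lower() for t in available_content_types}


html_matches = content_type_matches("text/html; charset=utf-8", ["text/html"])
png_matches = content_type_matches("image/png", ["text/html"])
```

Stripping the parameters first keeps a scraper registered for `text/html` from being skipped only because the server appended a charset.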

_Handler__get_all_scrapers()[source]

Find all available scraper references.

Returns: The scraper references.
Return type: list(obj)

_Handler__get_all_scrapers_modules()[source]

Find all available scraper modules.

Returns: The scraper modules.
Return type: list(obj)

_Handler__make_request(url, method, data, auth, cookies, headers, proxies)[source]

Execute a request with the given data.

Parameters:
  • url (str) – The URL to call.
  • method (str) – The method (e.g. get or post).
  • data (str) – The data to call the URL with.
  • auth (obj) – The authentication class.
  • cookies (obj) – The cookie dict.
  • headers (obj) – The header dict.
  • proxies (obj) – The proxies dict.
Returns:

The response object.

Return type:

obj
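As a rough illustration of how these parameters could be forwarded, the sketch below passes them straight to `requests.request` from Python's requests module. The `make_request` name and the injectable `send` argument are hypothetical additions for this example; the real method may also set timeouts, redirect handling or streaming options that are not documented here.

```python
def make_request(url, method, data=None, auth=None, cookies=None,
                 headers=None, proxies=None, send=None):
    """Execute a request with the given data and return the response.

    `send` is a hypothetical hook so the sketch can be exercised without
    network access; by default it falls back to `requests.request`.
    """
    if send is None:
        import requests  # third-party dependency, imported lazily
        send = requests.request

    # requests.request accepts all of these as keyword arguments.
    return send(
        method=method,
        url=url,
        data=data,
        auth=auth,
        cookies=cookies,
        headers=headers,
        proxies=proxies,
    )


def fake_send(**kwargs):
    # Stand-in sender that echoes the arguments it was given.
    return kwargs


echoed = make_request("https://example.com/", "get",
                      headers={"User-Agent": "example-crawler"},
                      send=fake_send)
```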

__init__(options, queue_item)[source]

Construct the HTTP handler.

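Parameters:
  • options (obj) – The settings/options object.
  • queue_item (obj) – The queue item containing a request to execute.
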
get_new_requests()[source]

Retrieve all the new requests that were found in this request.

Returns: A list of request objects.
Return type: list(nyawc.http.Request)

nyawc.http.Request module

class nyawc.http.Request.Request(url, method='get', data=None, auth=None, cookies=None, headers=None, proxies=None)[source]

Bases: object

The Request class contains details that were used to request the specified URL.

METHOD_OPTIONS[source]

str – A request method that can be used to request the URL.

METHOD_GET[source]

str – A request method that can be used to request the URL.

METHOD_HEAD[source]

str – A request method that can be used to request the URL.

METHOD_POST[source]

str – A request method that can be used to request the URL.

METHOD_PUT[source]

str – A request method that can be used to request the URL.

METHOD_DELETE[source]

str – A request method that can be used to request the URL.

parent_raised_error[source]

bool – Whether the parent request raised an error (e.g. 404).

depth[source]

int – The current crawling depth.

url[source]

str – The absolute URL to use when making the request.

method[source]

str – The request method to use for the request.

data[source]

obj – The post data {key: value} OrderedDict that will be sent.

auth[source]

obj – The (requests module) authentication class to use for the request.

cookies[source]

obj – The (requests module) cookie jar to use for the request.

headers[source]

obj – The headers {key: value} to use for the request.

proxies[source]

obj – The proxies {key: value} to use for the request.

METHOD_DELETE = 'delete'[source]
METHOD_GET = 'get'[source]
METHOD_HEAD = 'head'[source]
METHOD_OPTIONS = 'options'[source]
METHOD_POST = 'post'[source]
METHOD_PUT = 'put'[source]
__init__(url, method='get', data=None, auth=None, cookies=None, headers=None, proxies=None)[source]

Constructs a Request instance.

Parameters:
  • url (str) – The absolute URL to use when making the request.
  • method (str) – The request method to use for the request.
  • data (obj) – The post data {key: value} OrderedDict that will be sent.
  • auth (obj) – The (requests module) authentication class to use for the request.
  • cookies (obj) – The (requests module) cookie jar to use for the request.
  • headers (obj) – The headers {key: value} to use for the request.
  • proxies (obj) – The proxies {key: value} to use for the request.
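The constructor arguments above can be illustrated with a small stand-in class that mirrors the documented signature and method constants. This is a sketch so the example runs on its own, not the library's source; with nyawc installed, the class would instead be imported from `nyawc.http.Request`. The URL, form fields and header value are made up for the example.

```python
from collections import OrderedDict


class Request(object):
    """Stand-in mirroring the documented Request signature (illustrative)."""

    METHOD_OPTIONS = "options"
    METHOD_GET = "get"
    METHOD_HEAD = "head"
    METHOD_POST = "post"
    METHOD_PUT = "put"
    METHOD_DELETE = "delete"

    def __init__(self, url, method=METHOD_GET, data=None, auth=None,
                 cookies=None, headers=None, proxies=None):
        self.url = url
        self.method = method
        self.data = data
        self.auth = auth
        self.cookies = cookies
        self.headers = headers
        self.proxies = proxies


# A plain GET request only needs the absolute URL.
home = Request("https://example.com/")

# A POST request with ordered form data and a custom header.
login = Request(
    "https://example.com/login",
    method=Request.METHOD_POST,
    data=OrderedDict([("user", "alice"), ("pass", "secret")]),
    headers={"User-Agent": "example-crawler"},
)
```

Using the `METHOD_*` constants rather than bare strings keeps the method names consistent with the values listed above.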

nyawc.http.Response module

class nyawc.http.Response.Response(url)[source]

Bases: object

Placeholder response class used before the request has finished.

url[source]

str – The absolute URL of the request/response.

Note

This class will be replaced with the response class of Python’s requests module when the request is finished. For more information, see http://docs.python-requests.org/en/master/api/#requests.Response.

__init__(url)[source]

Constructs a Response instance.

Parameters: url (str) – The absolute URL of the request/response.