nltk.twitter.TweetWriter

class nltk.twitter.TweetWriter[source]

Bases: TweetHandlerI

Handle data by writing it to a file.

__init__(limit=2000, upper_date_limit=None, lower_date_limit=None, fprefix='tweets', subdir='twitter-files', repeat=False, gzip_compress=False)[source]

The difference between the upper and lower date limits depends on whether Tweets are coming in an ascending date order (i.e. when streaming) or descending date order (i.e. when searching past Tweets).

Parameters
  • limit (int) – Number of data items to process in the current round of processing.

  • upper_date_limit (tuple) – The date at which to stop collecting new data. This should be entered as a tuple which can serve as the argument to datetime.datetime. E.g. upper_date_limit=(2015, 4, 1, 12, 40) for 12:40 pm on April 1 2015.

  • lower_date_limit (tuple) – The date at which to stop collecting new data. See upper_date_limit for formatting.

  • fprefix (str) – The prefix to use in creating file names for Tweet collections.

  • subdir (str) – The name of the directory where Tweet collection files should be stored.

  • repeat (bool) – Flag to determine whether multiple files should be written. If True, the length of each file will be set by the value of limit. See also handle().

  • gzip_compress (bool) – If True, output files are compressed with gzip.
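
As the parameter descriptions note, each date limit is a tuple that can be unpacked into datetime.datetime. The minimal sketch below uses only the standard library to show that conversion. The variable names are illustrative, and any time zone handling TweetWriter may apply is not covered here.

```python
from datetime import datetime

# Tuple in the (year, month, day, hour, minute) form described above.
upper_date_limit = (2015, 4, 1, 12, 40)

# Unpack the tuple into datetime.datetime's positional arguments.
upper = datetime(*upper_date_limit)

print(upper.isoformat())  # 2015-04-01T12:40:00
```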

timestamped_file()[source]
Returns

timestamped file name

Return type

str
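
A timestamped name of this kind could be built from fprefix and subdir roughly as follows. The helper name, the timestamp format, and the .json and .gz suffixes are assumptions made for illustration and may differ from what TweetWriter actually produces.

```python
import os
from datetime import datetime

def make_timestamped_name(fprefix="tweets", subdir="twitter-files",
                          gzip_compress=False, now=None):
    """Hypothetical helper: build '<subdir>/<fprefix>.<timestamp>.json[.gz]'."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
    name = f"{fprefix}.{stamp}.json"
    if gzip_compress:
        name += ".gz"
    return os.path.join(subdir, name)

# A fixed datetime keeps the example reproducible.
example = make_timestamped_name(now=datetime(2015, 4, 1, 12, 40))
```

Passing the current time as an argument, rather than reading the clock inside the function, makes the naming scheme easy to test.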

handle(data)[source]

Write Twitter data as line-delimited JSON into one or more files.

Returns

False if processing should cease, otherwise True.

Parameters

data – tweet object returned by Twitter API
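
Line-delimited JSON means one complete JSON document per line. The sketch below shows that format with the standard library only, including the optional gzip compression. The helper names write_jsonl and read_jsonl are hypothetical and are not part of nltk.twitter.

```python
import gzip
import json
import os
import tempfile

def write_jsonl(records, path, gzip_compress=False):
    """Hypothetical helper: write one JSON document per line."""
    opener = gzip.open if gzip_compress else open
    # "wt" opens in text mode for both plain and gzip files.
    with opener(path, "wt", encoding="utf-8") as out:
        for record in records:
            out.write(json.dumps(record) + "\n")

def read_jsonl(path, gzip_compress=False):
    """Hypothetical helper: parse each line back into a Python object."""
    opener = gzip.open if gzip_compress else open
    with opener(path, "rt", encoding="utf-8") as src:
        return [json.loads(line) for line in src]

tweets = [{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tweets.json.gz")
    write_jsonl(tweets, path, gzip_compress=True)
    round_trip = read_jsonl(path, gzip_compress=True)
```

Because each line is an independent document, files written this way can be read back one Tweet at a time without loading the whole collection into memory.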

on_finish()[source]

Actions to take when the Tweet limit has been reached.

do_continue()[source]

Returns False if the client should stop fetching Tweets.

check_date_limit(data, verbose=False)[source]

Validate date limits.
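
One way such a check can work is to parse the Tweet's created_at string and compare it with the configured limits. The sketch below assumes the classic Twitter API timestamp layout and uses a hypothetical function name. It ignores time zone conversion and should not be read as NLTK's implementation.

```python
from datetime import datetime

# Assumed created_at layout of the classic Twitter API.
DATE_FMT = "%a %b %d %H:%M:%S +0000 %Y"

def outside_date_limits(tweet, upper=None, lower=None):
    """Hypothetical check: True if the Tweet falls outside the given limits."""
    tweet_date = datetime.strptime(tweet["created_at"], DATE_FMT)
    if upper is not None and tweet_date > datetime(*upper):
        return True  # newer than the upper limit
    if lower is not None and tweet_date < datetime(*lower):
        return True  # older than the lower limit
    return False

tweet = {"created_at": "Wed Apr 01 12:45:00 +0000 2015"}
stop = outside_date_limits(tweet, upper=(2015, 4, 1, 12, 40))
```

Here the Tweet was created at 12:45, after the 12:40 upper limit, so a handler following this logic would signal the client to stop.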

counter

Count of Tweets handled so far.

do_stop

A flag to indicate to the client whether to stop fetching data given some condition (e.g., reaching a date limit).

max_id

Stores the id of the last fetched Tweet to handle pagination.