nltk.corpus.reader.TEICorpusView
- class nltk.corpus.reader.TEICorpusView
Bases: StreamBackedCorpusView
- __init__(corpus_file, tagged, group_by_sent, group_by_para, tagset=None, head_len=0, textids=None)
Create a new corpus view, based on the file fileid, and read with block_reader. See the class documentation for more information.
- Parameters
fileid – The path to the file that is read by this corpus view. fileid can either be a string or a PathPointer.
startpos – The file position at which the view will start reading. This can be used to skip over preface sections.
encoding – The unicode encoding that should be used to read the file’s contents. If no encoding is specified, then the file’s contents will be read as a non-unicode string (i.e., a str).
- read_block(stream)
Read a block from the input stream.
- Returns
a block of tokens from the input stream
- Return type
list(any)
- Parameters
stream (stream) – an input stream
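The contract described above (take an open stream, return a list of tokens) can be illustrated with a minimal standard-library sketch. This is not NLTK's implementation: `read_line_block` and `iterate_blocks` are hypothetical names introduced only for illustration, and treating one line as one block is an assumption.

```python
import io


def read_line_block(stream):
    """Hypothetical block reader: treat one line as one block of tokens."""
    line = stream.readline()
    # An empty string at end of stream splits to an empty list.
    return line.split()


def iterate_blocks(stream, block_reader):
    """Collect tokens block by block until the stream stops advancing."""
    tokens = []
    while True:
        position = stream.tell()
        block = block_reader(stream)
        if stream.tell() == position:
            # The reader consumed nothing, so the stream is exhausted.
            break
        tokens.extend(block)
    return tokens


stream = io.StringIO("the quick fox\n\njumps over\n")
tokens = iterate_blocks(stream, read_line_block)
print(tokens)
```

A blank line yields an empty block but still advances the stream, so the loop uses the file position rather than an empty result to detect the end of input.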
- close()
Close the file stream associated with this corpus view. This can be useful if you are worried about running out of file handles (although the stream should automatically be closed upon garbage collection of the corpus view). If the corpus view is accessed after it is closed, it will be automatically re-opened.
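The close-then-reopen behaviour described above can be sketched with a small standard-library stand-in. `ReopeningView` is a hypothetical class written for illustration and is not how NLTK implements its corpus views; it only shows the pattern of dropping the file handle on `close()` and opening it again lazily on the next access.

```python
import os
import tempfile


class ReopeningView:
    """Hypothetical stand-in for a view that reopens its file on demand."""

    def __init__(self, path, encoding="utf-8"):
        self._path = path
        self._encoding = encoding
        self._stream = None  # opened lazily on first access

    def _ensure_open(self):
        # Open the file if it was never opened or has been closed.
        if self._stream is None:
            self._stream = open(self._path, encoding=self._encoding)
        return self._stream

    @property
    def is_open(self):
        return self._stream is not None

    def tokens(self):
        stream = self._ensure_open()
        stream.seek(0)
        return stream.read().split()

    def close(self):
        # Release the file handle; a later access reopens it.
        if self._stream is not None:
            self._stream.close()
        self._stream = None


with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("alpha beta gamma\n")
    path = tmp.name

view = ReopeningView(path)
first = view.tokens()
view.close()
closed_after_close = not view.is_open
second = view.tokens()  # access after close() reopens the file
view.close()
os.remove(path)
```

Closing explicitly frees the handle right away, which matters when many views are alive at once; the lazy reopen keeps the view usable afterwards.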
- property fileid
The fileid of the file that is accessed by this view.
- Type
str or PathPointer