| |
- get_capture_groups(hit, pos_tags=False)
- Get CQL labeled groups from a search result
Parameters
----------
hit : dict
Data returned by the BlackLab API, see search().
pos_tags : bool, optional
Whether to include PoS tags in the returned data, by default False,
which only includes word forms.
Returns
-------
dict
A dictionary whose keys are the CQL labeled group names and whose
values are the keywords captured by the corresponding group.
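The mapping described above can be illustrated with a minimal sketch. It is not the library's implementation: `capture_groups_sketch` and `example_hit` are hypothetical, and the hit layout (a `match` dict with `word` and `pos` lists, plus a `captureGroups` list with `name`, `start`, `end`) is an assumption about the BlackLab JSON response. Returning `(word, tag)` pairs when `pos_tags=True` is likewise an assumed format.

```python
def capture_groups_sketch(hit, pos_tags=False):
    """Illustrative stand-in for get_capture_groups (assumed hit layout)."""
    words = hit["match"]["word"]
    tags = hit["match"].get("pos", [])
    # Assumption: group offsets are corpus positions, so they are shifted
    # by the hit's own start to index into the matched tokens.
    offset = hit["start"]
    groups = {}
    for group in hit.get("captureGroups", []):
        lo = group["start"] - offset
        hi = group["end"] - offset
        if pos_tags:
            # Assumed output format: (word form, PoS tag) pairs.
            groups[group["name"]] = list(zip(words[lo:hi], tags[lo:hi]))
        else:
            groups[group["name"]] = words[lo:hi]
    return groups


# Hypothetical hit for a pattern with two labeled groups, "adj" and "noun".
example_hit = {
    "start": 10,
    "end": 13,
    "match": {
        "word": ["very", "good", "idea"],
        "pos": ["ADV", "ADJ", "NOUN"],
    },
    "captureGroups": [
        {"name": "adj", "start": 11, "end": 12},
        {"name": "noun", "start": 12, "end": 13},
    ],
}

word_forms = capture_groups_sketch(example_hit)
with_tags = capture_groups_sketch(example_hit, pos_tags=True)
```

With the default `pos_tags=False`, each label maps to word forms only; the tagged variant carries the same tokens with their tags attached.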
- search(pattern, cql_enable=True, board=None, year_from=None, year_to=None, number=20, wordsaroundhit=0, chunk_size=500, get_meta=False)
- Wrapper function to query the BlackLab API
Parameters
----------
pattern : str
Pattern to search for
cql_enable : bool, optional
Whether `pattern` is written in CQL, by default True
board : str, optional
Limit search to a particular board, by default None
year_from : int, optional
Limit search to dates after `yyyy-01-01`, where yyyy equals `year_from`, by default None
year_to : int, optional
Limit search to dates before `yyyy-12-31`, where yyyy equals `year_to`, by default None
number : int, optional
Number of results to return, by default 20.
If None, returns all matching results found in the corpus (BE CAREFUL, this places a heavy load
on the BlackLab server if there is a large number of matching results).
wordsaroundhit : int, optional
The number of tokens around the keywords to return, by default 0
chunk_size : int, optional
Number of results to request in each iteration of the query, by default 500
get_meta : bool, optional
Whether to return metadata, by default False
Returns
-------
tuple
If get_meta is False, returns a tuple of length two: (hits, requested_urls).
If get_meta is True, returns a tuple of length three: (hits, metadata, requested_urls).
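The interaction between `number` and `chunk_size` can be sketched as follows. This is a hedged illustration, not the wrapper's actual code: `plan_chunks` and `total_hits` are hypothetical names, and the sketch assumes each iteration requests one window of results identified by a starting offset and a size.

```python
def plan_chunks(number, chunk_size=500, total_hits=None):
    """Return (first, size) windows covering the requested results.

    Hypothetical helper: `total_hits` stands for the number of matches
    the server reports, which is needed when `number` is None.
    """
    if number is None:
        if total_hits is None:
            raise ValueError("total_hits is required when number is None")
        number = total_hits
    elif total_hits is not None:
        # Never request more results than the corpus contains.
        number = min(number, total_hits)

    windows = []
    first = 0
    while first < number:
        size = min(chunk_size, number - first)
        windows.append((first, size))
        first += size
    return windows


default_plan = plan_chunks(20)                 # fits in a single request
large_plan = plan_chunks(1200)                 # split into three requests
all_plan = plan_chunks(None, total_hits=1200)  # "return everything" case
```

Under these assumptions the default `number=20` needs one request, while `number=None` issues as many requests as the corpus requires, which is why it can load the server heavily.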
- top_n(freq_table: dict, n=25)
|