bluesearch.widgets.mining_widget module
Module for the mining widget.
- class MiningWidget(**kwargs)
Bases: ipywidgets.widgets.widget_box.VBox
The mining widget.
- Parameters
mining_server_url (str) – The URL of the mining server.
mining_schema (bluesearch.widgets.MiningSchema) – The requested mining schema (entity, relation, attribute types).
article_saver (bluesearch.widgets.ArticleSaver) – An instance of the article saver.
default_text (str, optional) – The default text assigned to the text area.
use_cache (bool) – If True, the mining server uses cached mining results stored in an SQL database, which should make repeated requests considerably faster.
checkpoint_path (str or pathlib.Path, optional) – Path where checkpoints are saved to and loaded from. If None, defaults to ~/.cache/bluesearch/widgets_checkpoints folder.
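The fallback described for checkpoint_path can be sketched as follows. This is a minimal illustration and not the widget's actual code: the helper name resolve_checkpoint_path and the constant DEFAULT_CHECKPOINT_DIR are hypothetical, and only the default folder itself comes from the parameter description above.

```python
from pathlib import Path
from typing import Optional, Union

# Default folder named in the parameter description above.
DEFAULT_CHECKPOINT_DIR = Path("~/.cache/bluesearch/widgets_checkpoints")


def resolve_checkpoint_path(
    checkpoint_path: Optional[Union[str, Path]] = None,
) -> Path:
    """Return the checkpoint folder as a Path (hypothetical helper).

    Assumption: None means "use the default folder", and a leading "~"
    is expanded to the current user's home directory.
    """
    if checkpoint_path is None:
        return DEFAULT_CHECKPOINT_DIR.expanduser()
    return Path(checkpoint_path).expanduser()
```

Accepting both str and pathlib.Path and normalizing to Path in one place means the rest of the code only has to handle a single type.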
- get_extracted_table()
Retrieve the table with the mining results.
- Returns
results_table – The table with the mining results.
- Return type
pandas.DataFrame
- textmining_pipeline(information, schema_df, debug=False)
Handle text mining server requests depending on the type of information.
- Parameters
information (str or list) – Either a raw text string or a list of (article_id, paragraph_id) tuples referring to paragraphs in the database.
schema_df (pd.DataFrame) – A dataframe with the requested mining schema (entity, relation, attribute types).
debug (bool) – If True, the columns do not necessarily match the specification, but they contain debugging information. If False, the columns match the specification exactly.
- Returns
table_extractions – The final table. If debug=True, it contains all the metadata. If debug=False, it contains only the columns in the official specification.
- Return type
pd.DataFrame
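The two accepted forms of information can be told apart with a simple type check. The sketch below only illustrates that distinction and is not the method's actual implementation: the function classify_information and the labels "raw_text" and "paragraph_ids" are hypothetical names introduced for the example.

```python
from typing import List, Tuple, Union

# (article_id, paragraph_id), as described for the `information` parameter.
ParagraphRef = Tuple[int, int]


def classify_information(information: Union[str, List[ParagraphRef]]) -> str:
    """Say which kind of input was passed (hypothetical helper).

    Returns "raw_text" for a string and "paragraph_ids" for a list of
    2-tuples. Anything else raises TypeError.
    """
    if isinstance(information, str):
        return "raw_text"
    if isinstance(information, list) and all(
        isinstance(item, tuple) and len(item) == 2 for item in information
    ):
        return "paragraph_ids"
    raise TypeError(
        "information must be a str or a list of (article_id, paragraph_id) tuples"
    )
```

For instance, classify_information("Glucose is a sugar.") takes the raw-text branch, while classify_information([(1, 0), (1, 2)]) takes the paragraph-identifier branch.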