bluesearch.widgets.mining_widget module

Module for the mining widget.

class MiningWidget(**kwargs)

Bases: ipywidgets.widgets.widget_box.VBox

The mining widget.

Parameters
  • mining_server_url (str) – The URL of the mining server.

  • mining_schema (bluesearch.widgets.MiningSchema) – The requested mining schema (entity, relation, attribute types).

  • article_saver (bluesearch.widgets.ArticleSaver) – An instance of the article saver.

  • default_text (str, optional) – The default text assigned to the text area.

  • use_cache (bool) – If True, the mining server uses cached mining results stored in an SQL database, which should lead to major speedups.

  • checkpoint_path (str or pathlib.Path, optional) – Path where checkpoints are saved to and loaded from. If None, defaults to the ~/.cache/bluesearch/widgets_checkpoints folder.
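
The fallback described for checkpoint_path can be sketched as follows. This is a minimal illustration, not the widget's actual code: the helper name resolve_checkpoint_path and the constant DEFAULT_CHECKPOINT_DIR are hypothetical, and only the default folder itself comes from the parameter description above.

```python
from pathlib import Path
from typing import Optional, Union

# Default folder named in the parameter description; the constant name is hypothetical.
DEFAULT_CHECKPOINT_DIR = Path.home() / ".cache" / "bluesearch" / "widgets_checkpoints"


def resolve_checkpoint_path(
    checkpoint_path: Optional[Union[str, Path]] = None,
) -> Path:
    """Hypothetical helper: return the folder to use for checkpoints.

    Falls back to the documented default when no path is given.
    """
    if checkpoint_path is None:
        return DEFAULT_CHECKPOINT_DIR
    # Accept both str and pathlib.Path, and expand a leading "~" if present.
    return Path(checkpoint_path).expanduser()
```

Calling resolve_checkpoint_path() with no argument yields the default folder, while a str or pathlib.Path argument is returned as a pathlib.Path.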

get_extracted_table()

Retrieve the table with the mining results.

Returns

results_table – The table with the mining results.

Return type

pandas.DataFrame

textmining_pipeline(information, schema_df, debug=False)

Handle text mining server requests depending on the type of information.

Parameters
  • information (str or list) – Either a raw text string or a list of tuples (article_id, paragraph_id) referring to paragraphs in the database.

  • schema_df (pd.DataFrame) – A dataframe with the requested mining schema (entity, relation, attribute types).

  • debug (bool) – If True, the columns do not necessarily match the specification, but they contain debugging information. If False, the columns match the specification exactly.

Returns

table_extractions – The final table. If debug=True, it contains all the metadata. If debug=False, it contains only the columns in the official specification.

Return type

pd.DataFrame
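
The two accepted forms of the information argument can be told apart with a simple type check. The sketch below only illustrates that distinction under stated assumptions: the function describe_information and its return labels are hypothetical and do not reproduce how textmining_pipeline actually builds its server requests.

```python
from typing import List, Tuple, Union

# Either raw text or a list of (article_id, paragraph_id) tuples.
Information = Union[str, List[Tuple[object, object]]]


def describe_information(information: Information) -> str:
    """Hypothetical helper: classify the kind of input that was passed."""
    if isinstance(information, str):
        # Raw text that would be mined directly.
        return "raw_text"
    if isinstance(information, list) and all(
        isinstance(item, tuple) and len(item) == 2 for item in information
    ):
        # Identifiers of paragraphs stored in the database.
        return "paragraph_ids"
    raise TypeError(
        "information must be a str or a list of (article_id, paragraph_id) tuples"
    )
```

A string is classified as raw text, a list of 2-tuples as paragraph identifiers, and anything else raises TypeError.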