bluesearch.widgets.article_saver module
Module for the article_saver.
- class ArticleSaver(connection)
Bases: object
Keeps track of selected articles.
This class can be used to save a number of articles and paragraphs for later use. A typical use case is to keep track of the items selected in the search widget and to retrieve them later in the mining widget.
Furthermore, this class can print a summary of all selected items with the summary_table method, resolve all items into paragraphs with their section names and summarize them in a pandas data frame with the get_chosen_texts method, and export a PDF report of all saved items with the make_report method.
- Parameters
connection (sqlalchemy.engine.Engine) – An SQL database connectable compatible with pandas.read_sql. The database is expected to contain the paragraphs and articles tables.
- connection
An SQL database connectable compatible with pandas.read_sql. The database is expected to contain the paragraphs and articles tables.
- Type
sqlalchemy.engine.Engine
- state
The state that keeps track of saved items. It is a set of tuples of the form (article_id, paragraph_id), each representing one saved item. An item with paragraph_id = -1 indicates that the whole article is saved.
- Type
set
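The representation described above can be illustrated with a plain Python set. This is a standalone sketch, not the ArticleSaver implementation; the constant name WHOLE_ARTICLE and the example IDs are hypothetical.

```python
# Standalone sketch: a plain set stands in for ArticleSaver.state.
WHOLE_ARTICLE = -1  # hypothetical name for the sentinel described above

state = set()

# Save paragraphs 0 and 3 of article 7.
state.add((7, 0))
state.add((7, 3))

# Save article 12 as a whole.
state.add((12, WHOLE_ARTICLE))

# Adding the same item twice has no effect, because state is a set.
state.add((7, 0))

# Split the saved items into whole articles and individual paragraphs.
whole_articles = sorted(a for a, p in state if p == WHOLE_ARTICLE)
single_paragraphs = sorted((a, p) for a, p in state if p != WHOLE_ARTICLE)
```

Because the items are tuples in a set, duplicates are dropped automatically and membership checks are cheap.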
- state_hash
A hash uniquely identifying a certain state. This is used to cache df_chosen_texts and avoid recomputing it if the state has not changed.
- Type
int or None
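The caching idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the class CachedView, its attribute names, and the choice of hashing a sorted tuple of the state are all hypothetical.

```python
class CachedView:
    """Hypothetical stand-in that recomputes a result only when the state changes."""

    def __init__(self):
        self.state = set()
        self.state_hash = None  # None until the first computation
        self.cached = None
        self.n_computations = 0  # counts how often the expensive step ran

    def _expensive(self):
        # Placeholder for the costly step (e.g. querying a database).
        self.n_computations += 1
        return sorted(self.state)

    def get(self):
        # Assumption: hash a sorted tuple so the value does not depend
        # on the iteration order of the set.
        new_hash = hash(tuple(sorted(self.state)))
        if new_hash != self.state_hash:
            self.cached = self._expensive()
            self.state_hash = new_hash
        return self.cached


view = CachedView()
view.state.add((1, 0))
view.get()
view.get()  # state unchanged: the cached result is reused
view.state.add((2, -1))
result = view.get()  # state changed: the result is recomputed
```

Note that a hash identifies a state only up to collisions, so a scheme like this trades a very small risk of a stale cache for a cheap comparison.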
- df_chosen_texts
A data frame in which the rows represent different paragraphs and the columns are ‘article_id’, ‘section_name’, ‘paragraph_id’, ‘text’.
- Type
pd.DataFrame
- add_paragraph(article_id, paragraph_pos_in_article)
Save a paragraph.
- Parameters
article_id (int) – The article ID.
paragraph_pos_in_article (int) – The position of the paragraph within the article, which serves as the paragraph ID.
- get_chosen_texts()
Retrieve the currently saved items.
For every article that is saved in its entirety, the corresponding paragraphs are resolved first.
- Returns
df_chosen_texts
- Return type
pandas.DataFrame
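Resolving whole-article entries into paragraphs might look roughly like the sketch below. It uses the standard-library sqlite3 module instead of SQLAlchemy and pandas so that it runs on its own. The table layout, the column names in the database, the sample rows, and the helper resolve are assumptions made for illustration, not the actual schema or implementation.

```python
import sqlite3

# In-memory database with a hypothetical, simplified paragraphs table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE paragraphs ("
    "article_id INTEGER, paragraph_pos_in_article INTEGER, "
    "section_name TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO paragraphs VALUES (?, ?, ?, ?)",
    [
        (1, 0, "Abstract", "a1-p0"),
        (1, 1, "Methods", "a1-p1"),
        (2, 0, "Abstract", "a2-p0"),
        (2, 1, "Results", "a2-p1"),
    ],
)

# Article 1 saved as a whole, plus one paragraph of article 2.
state = {(1, -1), (2, 1)}

COLUMNS = "article_id, section_name, paragraph_pos_in_article, text"


def resolve(state, conn):
    """Expand whole-article entries so that each row is one paragraph."""
    rows = set()
    for article_id, paragraph_pos in state:
        if paragraph_pos == -1:
            # Whole article: fetch all of its paragraphs.
            query = f"SELECT {COLUMNS} FROM paragraphs WHERE article_id = ?"
            params = (article_id,)
        else:
            # Single paragraph: fetch only that one.
            query = (
                f"SELECT {COLUMNS} FROM paragraphs "
                "WHERE article_id = ? AND paragraph_pos_in_article = ?"
            )
            params = (article_id, paragraph_pos)
        rows.update(conn.execute(query, params).fetchall())
    return sorted(rows)


chosen = resolve(state, conn)
```

Collecting the rows in a set avoids duplicates when a paragraph is saved both individually and through its whole article.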
- get_saved_items()
Retrieve the saved items that summarize the choices of the user.
- Returns
identifiers – Tuples of the form (article_id, paragraph_pos_in_article) chosen by the user.
- Return type
list of tuple
- has_article(article_id)
Check if an article has been saved.
- Parameters
article_id (int) – The article ID.
- Returns
result – Whether or not the given article has been saved.
- Return type
bool
- has_paragraph(article_id, paragraph_pos_in_article)
Check if a paragraph has been saved.
- Parameters
article_id (int) – The article ID.
paragraph_pos_in_article (int) – The position of the paragraph within the article, which serves as the paragraph ID.
- Returns
result – Whether or not the given paragraph has been saved.
- Return type
bool
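One plausible reading of the two membership checks, given the state representation described earlier, is sketched below. The free functions are hypothetical stand-ins that take the state as an argument, and the exact semantics are an assumption rather than the documented behavior.

```python
# Standalone sketch: a plain set stands in for ArticleSaver.state.
state = {(7, 0), (7, 3), (12, -1)}


def has_article(state, article_id):
    # Assumption: an article counts as saved only if its
    # whole-article entry (paragraph ID -1) is present.
    return (article_id, -1) in state


def has_paragraph(state, article_id, paragraph_pos_in_article):
    # Assumption: only an explicit entry for this paragraph counts.
    return (article_id, paragraph_pos_in_article) in state
```

Under this reading, a paragraph of an article saved as a whole is not reported by has_paragraph unless it was also added explicitly; the actual methods may treat this case differently.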
- make_report(output_dir=None)
Create the saved articles report.
- Parameters
output_dir (str or pathlib.Path, optional) – The directory for writing the report.
- Returns
output_file_path – The file to which the report was written.
- Return type
pathlib.Path
- remove_article(article_id)
Remove an article from the saved items.
- Parameters
article_id (int) – The article ID.