bluesearch.database.mesh module
Utilities for handling MeSH topic data.
- class MeSHTree(tree_number_to_label: dict[str, str])
Bases: object
The hierarchical tree of MeSH topics.
The MeSH topic ontology forms a tree with the most general topics at the root and the most specific topics as the leaves. Here is part of a MeSH topic hierarchy:

Natural Science Disciplines [H01]
    Biological Science Disciplines [H01.158]
        Biology [H01.158.273]
            Botany [H01.158.273.118]
                Ethnobotany [H01.158.273.118.299]
                Pharmacognosy [H01.158.273.118.598]
                    Herbal Medicine [H01.158.273.118.598.500]
            Cell Biology [H01.158.273.160]
            ...

The full data can be found in the NLM's MeSH browser at https://meshb.nlm.nih.gov/.
Each position in the tree is uniquely identified by its tree number (e.g. H01.158), while the same topic label can appear at several positions, that is, under several tree numbers.
- Parameters
tree_number_to_label – The MeSH tree data. This dictionary should have tree numbers (e.g. H01.158.273) as keys, and topic labels (e.g. Biology) as values.
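As a small illustration of the expected data shape, the sketch below builds a tree_number_to_label dictionary from entries quoted on this page and inverts it to show that one label can map to several tree numbers. The helper name invert is hypothetical and not part of bluesearch.

```python
from __future__ import annotations

from collections import defaultdict

# Tree numbers are the unique keys; labels are the values and may repeat.
tree_number_to_label = {
    "H01": "Natural Science Disciplines",
    "H01.158": "Biological Science Disciplines",
    "H01.158.273": "Biology",
    "F04.096.628.255.500": "Cognitive Neuroscience",
    "H01.158.610.030": "Cognitive Neuroscience",
}


def invert(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Group tree numbers by label (hypothetical helper, for illustration)."""
    label_to_tree_numbers: dict[str, list[str]] = defaultdict(list)
    for tree_number, label in mapping.items():
        label_to_tree_numbers[label].append(tree_number)
    return dict(label_to_tree_numbers)


label_to_tree_numbers = invert(tree_number_to_label)
# "Cognitive Neuroscience" is listed under two different tree numbers.
print(label_to_tree_numbers["Cognitive Neuroscience"])
```

With bluesearch installed, a dictionary of this shape is what the MeSHTree constructor expects as its only argument.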
- classmethod load(path: pathlib.Path | str) -> MeSHTree
Initialise the MeSH tree from a JSON file.
- Parameters
path – The path to the JSON file containing the MeSH tree data. See the tree_number_to_label parameter of the MeSHTree constructor for the data specification.
- Returns
An initialised instance of the MeSHTree.
- Return type
MeSHTree
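The sketch below shows one plausible layout for such a JSON file, assuming it is a flat JSON object mapping tree numbers to labels, as the reference to the tree_number_to_label parameter suggests. The file name mesh_tree.json is hypothetical, and the code uses only the standard library rather than bluesearch itself.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path

# Assumed file content: a flat JSON object of tree number -> topic label.
data = {
    "H01": "Natural Science Disciplines",
    "H01.158": "Biological Science Disciplines",
    "H01.158.273": "Biology",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "mesh_tree.json"  # hypothetical file name
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")

    # Reading the file back yields the dictionary the constructor expects.
    with path.open(encoding="utf-8") as fh:
        loaded = json.load(fh)

print(loaded["H01.158.273"])
```

If the assumption about the file layout holds, a file written this way could be passed to MeSHTree.load as either a pathlib.Path or a str.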
- parent_topics(topic: str) -> set[str]
Find all parent topic labels of a given topic.
Note that a topic label does not have to be unique and can be assigned to multiple tree numbers. This method resolves all parent topics from all tree numbers that have the given label.
- Parameters
topic – A MeSH topic label.
- Returns
All parent topic labels.
- Return type
set[str]
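The resolution described above can be sketched as follows. This is a minimal stand-alone illustration, not the bluesearch implementation: the function names and the toy tree (tree numbers X01, Y02 and their labels) are made up for the example.

```python
from __future__ import annotations

from collections import defaultdict

# Made-up toy tree in which "Shared Topic" appears at two positions.
toy_tree = {
    "X01": "Root A",
    "X01.100": "Shared Topic",
    "Y02": "Root B",
    "Y02.200": "Middle B",
    "Y02.200.300": "Shared Topic",
}


def parent_tree_numbers(tree_number: str) -> list[str]:
    """Return all ancestors of a tree number, nearest first."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def parent_topics_sketch(topic: str, tree: dict[str, str]) -> set[str]:
    """Collect parent labels over every tree number carrying the label."""
    label_to_tree_numbers: dict[str, list[str]] = defaultdict(list)
    for tree_number, label in tree.items():
        label_to_tree_numbers[label].append(tree_number)

    result: set[str] = set()
    for tree_number in label_to_tree_numbers[topic]:
        for parent in parent_tree_numbers(tree_number):
            result.add(tree[parent])
    return result


shared_parents = parent_topics_sketch("Shared Topic", toy_tree)
print(sorted(shared_parents))
```

Because the label sits under both X01 and Y02.200, the result merges the ancestors from both branches into one set.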
- static parents(tree_number: str) -> Generator[str, None, None]
Generate all parent tree numbers.
For example, given the tree number H01.158.273 the parent tree numbers are H01.158 and H01.
- Parameters
tree_number – A MeSH tree number, e.g. H01.158.273.
- Yields
The tree numbers of all parents of the given tree number.
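One way to produce these parent tree numbers is to strip the last dot-separated component repeatedly. The generator below is a sketch under that assumption; the name parents_sketch is hypothetical, and the nearest-first order simply follows the example given above rather than any documented guarantee.

```python
from __future__ import annotations

from collections.abc import Generator


def parents_sketch(tree_number: str) -> Generator[str, None, None]:
    """Yield the ancestors of a tree number, from nearest to most general."""
    parent, dot, _ = tree_number.rpartition(".")
    # rpartition returns an empty separator once no dot is left,
    # which means the root of the branch has been reached.
    while dot:
        yield parent
        parent, dot, _ = parent.rpartition(".")


ancestors = list(parents_sketch("H01.158.273"))
print(ancestors)
```

A root-level tree number such as H01 contains no dot, so the generator yields nothing for it.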
- parse_tree_numbers(nt_stream: TextIO) -> dict[str, str]
Parse the MeSH topic tree from a stream of MeSH RDF N-Triples.
- Parameters
nt_stream – A text stream of MeSH RDF N-Triples. This is intended to work with the content of the MeSH files downloaded from https://nlmpubs.nlm.nih.gov/projects/mesh/rdf
- Returns
A dictionary representing the parsed MeSH topic tree. The keys are the tree numbers, each of which uniquely identifies a position in the tree. The values are the corresponding topic labels. Note that the topic labels are not unique. For example, the two tree numbers F04.096.628.255.500 and H01.158.610.030 both have the label “Cognitive Neuroscience”.
- Return type
dict[str, str]
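To make the input and output shapes concrete, here is a rough sketch of such a parser. It rests on assumptions that should be checked against the real files: that a descriptor is linked to its tree numbers through a predicate ending in vocab#treeNumber, that its label uses a predicate ending in rdf-schema#label, and that URIs live under http://id.nlm.nih.gov/mesh. The descriptor identifier D_EXAMPLE is a placeholder, and parse_tree_numbers_sketch is not the bluesearch function.

```python
from __future__ import annotations

import io
import re
from typing import TextIO

# Assumed predicate suffixes; verify them against the actual MeSH RDF data.
TREE_NUMBER_PREDICATE = "vocab#treeNumber"
LABEL_PREDICATE = "rdf-schema#label"

# One triple per line: <subject> <predicate> object .
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+(.+?)\s*\.\s*$")


def parse_tree_numbers_sketch(nt_stream: TextIO) -> dict[str, str]:
    """Map tree numbers to labels under the assumptions stated above."""
    tree_numbers: dict[str, list[str]] = {}
    labels: dict[str, str] = {}
    for line in nt_stream:
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip blank or unrecognised lines
        subject, predicate, obj = match.groups()
        if predicate.endswith(TREE_NUMBER_PREDICATE):
            # The tree number is the last path segment of the object URI.
            tree_number = obj.strip("<>").rsplit("/", 1)[-1]
            tree_numbers.setdefault(subject, []).append(tree_number)
        elif predicate.endswith(LABEL_PREDICATE):
            label_match = re.match(r'"(.*)"', obj)
            if label_match:
                labels[subject] = label_match.group(1)
    return {
        tree_number: labels[subject]
        for subject, numbers in tree_numbers.items()
        if subject in labels
        for tree_number in numbers
    }


MESH = "http://id.nlm.nih.gov/mesh"  # assumed namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema"
sample = (
    f"<{MESH}/D_EXAMPLE> <{MESH}/vocab#treeNumber> <{MESH}/F04.096.628.255.500> .\n"
    f"<{MESH}/D_EXAMPLE> <{MESH}/vocab#treeNumber> <{MESH}/H01.158.610.030> .\n"
    f'<{MESH}/D_EXAMPLE> <{RDFS}#label> "Cognitive Neuroscience"@en .\n'
)
parsed = parse_tree_numbers_sketch(io.StringIO(sample))
print(parsed)
```

The sample reproduces the “Cognitive Neuroscience” case mentioned above: one label, two tree-number keys. A real N-Triples parser would also need to handle escaped characters in literals, which this sketch ignores.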
- resolve_parents(topics: Iterable[str], mesh_tree: MeSHTree) -> set[str]
Extend the given topics with the parent topics of each of them.
- Parameters
topics – A collection of MeSH topic labels.
mesh_tree – An instance of MeSHTree.
- Returns
A set with the input topics and all their parent topics.
- Return type
set[str]
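The behaviour described above amounts to a set union of the input topics with the parents of each one. The sketch below illustrates this with a plain lookup function standing in for MeSHTree.parent_topics; the names resolve_parents_sketch and toy_lookup are hypothetical, and the toy data covers only the hierarchy excerpt shown earlier on this page.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable

# Parent labels taken from the hierarchy excerpt on this page only.
TOY_PARENTS: dict[str, set[str]] = {
    "Botany": {
        "Biology",
        "Biological Science Disciplines",
        "Natural Science Disciplines",
    },
    "Cell Biology": {
        "Biology",
        "Biological Science Disciplines",
        "Natural Science Disciplines",
    },
}


def toy_lookup(topic: str) -> set[str]:
    """Stand-in for a parent lookup; unknown topics have no parents here."""
    return TOY_PARENTS.get(topic, set())


def resolve_parents_sketch(
    topics: Iterable[str],
    parent_lookup: Callable[[str], set[str]],
) -> set[str]:
    """Return the input topics together with all of their parents."""
    resolved = set(topics)
    # Iterate over a copy, because the set grows inside the loop.
    for topic in list(resolved):
        resolved |= parent_lookup(topic)
    return resolved


resolved_topics = resolve_parents_sketch(["Botany", "Cell Biology"], toy_lookup)
print(sorted(resolved_topics))
```

Returning a set means that parents shared by several input topics, such as Biology here, appear only once.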