bluesearch.database.mesh module

Utilities for handling MeSH topic data.

class MeSHTree(tree_number_to_label: dict[str, str])

Bases: object

The hierarchical tree of MeSH topics.

The MeSH topic ontology forms a tree with the most general topics at the root and the most specific topics at the leaves. Here is part of the MeSH topic hierarchy:

Natural Science Disciplines [H01]
    Biological Science Disciplines [H01.158]
        Biology [H01.158.273]
            Botany [H01.158.273.118]
                Ethnobotany [H01.158.273.118.299]
                Pharmacognosy [H01.158.273.118.598]
                    Herbal Medicine [H01.158.273.118.598.500]
            Cell Biology [H01.158.273.160]
...

The full data can be found in the NLM’s MeSH browser at https://meshb.nlm.nih.gov/.

Each position in the tree is uniquely identified by its tree number (e.g. H01.158), whereas the same topic label can appear under several tree numbers.

Parameters

tree_number_to_label – The MeSH tree data. This dictionary should have tree numbers (e.g. H01.158.273) as keys, and topic labels (e.g. Biology) as values.
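As an illustration, the first levels of the hierarchy excerpt above could be written as such a dictionary. This is a minimal sketch in plain Python; the `depth` variable is not part of bluesearch and only shows that the nesting level is implied by the number of dot-separated parts of a tree number.

```python
# Tree numbers and labels copied from the hierarchy excerpt above.
tree_number_to_label = {
    "H01": "Natural Science Disciplines",
    "H01.158": "Biological Science Disciplines",
    "H01.158.273": "Biology",
    "H01.158.273.118": "Botany",
}

# Illustrative only: the depth of a topic follows from its tree number.
depth = {number: number.count(".") for number in tree_number_to_label}
```

A dictionary of this shape is what the constructor signature expects, i.e. MeSHTree(tree_number_to_label).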

classmethod load(path: pathlib.Path | str) → MeSHTree

Initialise the MeSH tree from a JSON file.

Parameters

path – The path to the JSON file containing the MeSH tree data. See the tree_number_to_label parameter of the MeSHTree constructor for the data specification.

Returns

An initialised instance of the MeSHTree.

Return type

MeSHTree
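Given the data specification referenced above, the JSON file is presumably a single flat object that maps tree numbers to labels. The sketch below writes and re-reads such a file using only the standard library. The file name mesh_tree.json is hypothetical, and the exact on-disk layout is an assumption rather than something this page confirms.

```python
import json
import pathlib
import tempfile

tree_number_to_label = {
    "H01": "Natural Science Disciplines",
    "H01.158": "Biological Science Disciplines",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, chosen for this example only.
    path = pathlib.Path(tmp_dir) / "mesh_tree.json"
    path.write_text(json.dumps(tree_number_to_label, indent=2), encoding="utf-8")

    # Reading the file back yields the same flat mapping.
    loaded = json.loads(path.read_text(encoding="utf-8"))
```

Under that assumption, a file like this is what MeSHTree.load(path) would be given.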

parent_topics(topic: str) → set[str]

Find all parent topic labels of a given topic.

Note that a topic label does not have to be unique and can be assigned to multiple tree numbers. This method resolves all parent topics from all tree numbers that have the given label.

Parameters

topic – A MeSH topic label.

Returns

All parent topic labels.

Return type

set[str]
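The resolution across several tree numbers can be sketched in plain Python as follows. This is not the bluesearch implementation: the function name parent_topics_sketch and the toy tree (with made-up tree numbers and labels) are assumptions for illustration, as is the choice to leave the topic itself out of the result.

```python
def parent_topics_sketch(tree: dict, topic: str) -> set:
    """Collect the labels of all ancestors of every tree number labelled `topic`."""
    labels = set()
    for tree_number, label in tree.items():
        if label != topic:
            continue
        # Walk up the hierarchy by dropping the last dot-separated part.
        parent = tree_number
        while "." in parent:
            parent = parent.rpartition(".")[0]
            if parent in tree:
                labels.add(tree[parent])
    return labels


# Toy data with invented tree numbers; one label sits in two places.
toy_tree = {
    "X01": "Science",
    "X01.100": "Neuroscience",
    "X01.100.200": "Cognitive Neuroscience",
    "Y02": "Psychology",
    "Y02.300": "Cognitive Neuroscience",
}

result = parent_topics_sketch(toy_tree, "Cognitive Neuroscience")
```

Here `result` combines the ancestors found under both X01.100.200 and Y02.300.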

static parents(tree_number: str) → Generator[str, None, None]

Generate all parent tree numbers.

For example, given the tree number H01.158.273 the parent tree numbers are H01.158 and H01.

Parameters

tree_number – A MeSH tree number, e.g. H01.158.273.

Yields

The tree numbers of all parents of the given tree number.
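Because a tree number encodes its own path, the parents can be derived from the string alone. A minimal sketch of that idea, assuming parents are yielded from the nearest to the most general (the order used in the example above); parents_sketch is a hypothetical stand-in, not the library method:

```python
from typing import Generator


def parents_sketch(tree_number: str) -> Generator[str, None, None]:
    """Yield parent tree numbers by repeatedly dropping the last part."""
    while "." in tree_number:
        tree_number = tree_number.rpartition(".")[0]
        yield tree_number


print(list(parents_sketch("H01.158.273")))  # ['H01.158', 'H01']
```

A top-level tree number such as H01 has no parents, so the generator yields nothing for it.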

parse_tree_numbers(nt_stream: TextIO) → dict[str, str]

Parse the MeSH topic tree from a stream of MeSH RDF N-Triples.

Parameters

nt_stream – A text stream of MeSH RDF N-Triples. This is intended to work with the content of the MeSH files downloaded from the following website: https://nlmpubs.nlm.nih.gov/projects/mesh/rdf

Returns

A dictionary representing the parsed MeSH topic tree. The keys are the tree numbers that uniquely identify a topic. The values are the corresponding topic labels. Note that the topic labels are not unique. For example, the two tree numbers F04.096.628.255.500 and H01.158.610.030 both have the label “Cognitive Neuroscience”.

Return type

dict[str, str]
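To make the shape of the result concrete, here is a rough sketch of such a parser. It is not the bluesearch implementation and rests on several assumptions: that each line holds one triple, that a descriptor is linked to its tree numbers through a treeNumber predicate and to its label through rdfs:label, and that the tree number is the last path segment of the object URI. The descriptor IDs in the sample are invented, and escape sequences inside literals are not handled.

```python
import io
import re
from typing import TextIO

# Assumed layout of one line: <subject> <predicate> object .
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+(.+?)\s*\.\s*$")
# Assumed predicate URIs.
TREE_NUMBER_PREDICATE = "http://id.nlm.nih.gov/mesh/vocab#treeNumber"
LABEL_PREDICATE = "http://www.w3.org/2000/01/rdf-schema#label"


def parse_tree_numbers_sketch(nt_stream: TextIO) -> dict:
    labels = {}        # subject URI -> label
    tree_numbers = {}  # tree number -> subject URI
    for line in nt_stream:
        match = TRIPLE.match(line.strip())
        if match is None:
            continue
        subject, predicate, obj = match.groups()
        if predicate == LABEL_PREDICATE:
            literal = re.match(r'"(.*)"(?:@en)?$', obj)
            if literal:
                labels[subject] = literal.group(1)
        elif predicate == TREE_NUMBER_PREDICATE:
            # Keep only the last path segment of the object URI.
            tree_numbers[obj.strip("<>").rsplit("/", 1)[-1]] = subject
    # Join the two maps; skip tree numbers whose subject has no label.
    return {tn: labels[s] for tn, s in tree_numbers.items() if s in labels}


# Invented descriptor IDs; tree numbers and labels are taken from this page.
BASE = "http://id.nlm.nih.gov/mesh"
SAMPLE = f"""\
<{BASE}/D_EXAMPLE_1> <{LABEL_PREDICATE}> "Biology"@en .
<{BASE}/D_EXAMPLE_1> <{TREE_NUMBER_PREDICATE}> <{BASE}/H01.158.273> .
<{BASE}/D_EXAMPLE_2> <{LABEL_PREDICATE}> "Cognitive Neuroscience"@en .
<{BASE}/D_EXAMPLE_2> <{TREE_NUMBER_PREDICATE}> <{BASE}/F04.096.628.255.500> .
<{BASE}/D_EXAMPLE_2> <{TREE_NUMBER_PREDICATE}> <{BASE}/H01.158.610.030> .
"""

parsed = parse_tree_numbers_sketch(io.StringIO(SAMPLE))
```

In `parsed`, both F04.096.628.255.500 and H01.158.610.030 map to the same label, which is the non-uniqueness described above.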

resolve_parents(topics: Iterable[str], mesh_tree: MeSHTree) → set[str]

Extend the given topics with the parents of each topic.

Parameters
  • topics – An iterable of MeSH topic labels.

  • mesh_tree – An instance of MeSHTree.

Returns

A set with the input topics and all their parent topics.

Return type

set[str]
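The described result is the union of the input topics and the parents of each one. A minimal sketch of that union, in which the PARENT_LABELS mapping is a hypothetical stand-in for calls to MeSHTree.parent_topics and is filled in from the hierarchy excerpt on this page only:

```python
from typing import Iterable

# Hypothetical stand-in for MeSHTree.parent_topics: label -> parent labels.
PARENT_LABELS = {
    "Biology": {
        "Biological Science Disciplines",
        "Natural Science Disciplines",
    },
    "Botany": {
        "Biology",
        "Biological Science Disciplines",
        "Natural Science Disciplines",
    },
}


def resolve_parents_sketch(topics: Iterable[str], parent_labels: dict) -> set:
    """Return the input topics together with all their parent topics."""
    resolved = set()
    for topic in topics:  # single pass, so one-shot iterators work too
        resolved.add(topic)
        resolved |= parent_labels.get(topic, set())
    return resolved


resolved = resolve_parents_sketch(["Botany"], PARENT_LABELS)
```

Topics without known parents are kept in the result unchanged in this sketch.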