ObservationSet and FeatureSets

`ObservationSet` and `FeatureSet`

We can compose unique sets of Observation and Feature instances into ObservationSet and FeatureSet instances, respectively. These data structures can be used as a way to subset/filter samples/observations or to provide groups of samples to analyses (such as when comparing two groups for a differential expression analysis).

class data_structures.element_set.BaseElementSet(val, **kwargs)

A BaseElementSet is a collection of unique BaseElement instances (more likely the derived children classes) and is typically used as a metadata data structure attached to some "real" data. For instance, given a matrix of gene expressions, the ObservationSet (a child of BaseElementSet) is the set of samples that were assayed.

We depend on the native python set data structure and appropriately hashable/comparable BaseElement instances (and children).

This essentially copies most of the functionality of the native set class, simply passing through the operations, but includes some additional members specific to our application.

Notably, we disallow (i.e. raise exceptions) if there are attempts to create duplicate BaseElements, in contrast to native sets which silently ignore duplicate elements.

A serialized representation (where the concrete BaseElement type is Observation) would look like:

{
    "elements": [
        <Observation>,
        <Observation>,
        ...
    ]
}

class data_structures.observation_set.ObservationSet(val, **kwargs)

An ObservationSet is a collection of unique Observation instances and is typically used as a metadata data structure attached to some "real" data. For instance, given a matrix of gene expressions, the ObservationSet is the set of samples that were assayed.

We depend on the native python set data structure and appropriately hashable/comparable Observation instances.

This essentially copies most of the functionality of the native set class, simply passing through the operations, but includes some additional members specific to our application.

Notably, we disallow (i.e. raise exceptions) if there are attempts to create duplicate Observations, in contrast to native sets which silently ignore duplicate elements.

A serialized representation would look like:

{
    "elements": [
        <Observation>,
        <Observation>,
        ...
    ]
}

class data_structures.feature_set.FeatureSet(val, **kwargs)

A FeatureSet is a collection of unique Feature instances and is typically used as a metadata data structure attached to some "real" data. For instance, given a matrix of gene expressions, the FeatureSet is the set of genes.

We depend on the native python set data structure and appropriately hashable/comparable Feature instances.

This essentially copies most of the functionality of the native set class, simply passing through the operations, but includes some additional members specific to our application.

Notably, we disallow (i.e. raise exceptions) if there are attempts to create duplicate Features, in contrast to native sets which silently ignore duplicate elements.

A serialized representation would look like:

{
    "elements": [
        <Feature>,
        <Feature>,
        ...
    ]
}