The Avial model was created to provide a way to store data in entities.
The choice of which data structure to use to represent knowledge comes from the following tradeoff:
The ideal data format strikes the right balance between performance and expressivity.
Everything you need to know about an entity should be stored in the entity itself. Individual things should be represented as individual entities so that each one can be pointed to with its own entity ID.
This is what the Avial Model strives to be.
Properties are collections (i.e. a list) of name/[value](./Orchestra Value) pairs, each with an optional key field, that can be associated with an entity. Within a single entity, names can be any arbitrary text string of acceptable length (i.e. < 256 UTF-8 bytes). Keys are similarly character strings; however, they must be unique across all properties within an entity’s property list.
Each property in a property list may have zero or more annotations. An annotation is an attribute/[value](./Orchestra Value) pair. Attributes are defined by the Orchestra attribute taxonomy, an extensible enumerated list similar to the category, context, and class taxonomies. All annotations for a specific property must have a unique attribute unless the attribute is null. Properties with annotations are a convenient and efficient knowledge construct: an entire annotated property list can be retrieved with a single network call. Properties are particularly useful for creating registries where fast look-up is required.
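The property rules above can be pictured with a small in-memory sketch. This is not the Avial API: `Property`, `PropertyList` and the `LOCATION` attribute are hypothetical names used only for illustration, and the sketch ignores null attributes (a plain dict cannot hold several of them).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Property:
    name: str
    value: Any
    key: str = ""  # empty string means "no key"
    # attribute -> value; a dict keeps non-null attributes unique
    annotations: Dict[str, Any] = field(default_factory=dict)


class PropertyList:
    """Toy stand-in for an entity's property list (hypothetical)."""

    def __init__(self) -> None:
        self.items: List[Property] = []

    def insert(self, prop: Property) -> None:
        # Names must stay under 256 UTF-8 bytes.
        if len(prop.name.encode("utf-8")) >= 256:
            raise ValueError("name too long")
        # Non-empty keys must be unique across the whole list.
        if prop.key and any(p.key == prop.key for p in self.items):
            raise KeyError(f"property already exists: {prop.key!r}")
        self.items.append(prop)

    def get(self, key: str) -> Optional[Any]:
        for prop in self.items:
            if key and prop.key == key:
                return prop.value
        return None


props = PropertyList()
props.insert(Property(name="Alice", key="alice", value=42))
# Same name, different key: allowed, only keys have to be unique.
props.insert(Property(name="Alice", key="alice-2", value=43))
props.items[0].annotations["LOCATION"] = "Paris"
```

Keeping the uniqueness check on the key only mirrors the rule that names may repeat within an entity while keys may not.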
Facts provide a means for capturing and representing a wide variety of complex semantic structures including tuples, graphs, hypergraphs, and ultragraphs. A fact is an attribute/[value](./Orchestra Value) pair where each attribute, if not null, must be unique. Valid attributes are again dictated by the attribute taxonomy. Typically, whenever you have knowledge about an entity, you want to store it in a fact.
Each fact can have zero or more facets, features, fields, and/or frames.
It is important to note that this data structure does not impose any semantics. There are no definitive rules about what should be stored where in an Orchestra model. Defining how we use the Orchestra model for an entity is called modeling.
Facets are used when you want to represent multiple named instances of a fact attribute. They are name/[value](./Orchestra Value) pairs where each name, if not null, must be unique. Every facet can also have factors, which are the equivalent of features but are associated with that specific facet rather than with the fact itself. For example, I may have several addresses (my work address, my home address, etc.), so I can have an ADDRESS fact with a "work" facet and a "home" facet.
Factors are like features, but for a facet instead of a fact. Both my "work" address and my "home" address may have the same Country, City, Zip code, Street name, Street number and Apartment factors.
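As a rough picture of the ADDRESS example, the fact, its facets and their factors can be laid out as nested dictionaries. This is only a sketch in plain Python, not how Avial stores or exposes the data; the address values and the `get_factor` helper are made up for illustration.

```python
from typing import Any, Dict

# Fact -> identified by its attribute; facet -> by its name; factor -> by its key.
person_facts: Dict[str, Any] = {
    "ADDRESS": {
        "facets": {
            "work": {"factors": {"country": "France", "city": "Paris", "zip": "75001"}},
            "home": {"factors": {"country": "France", "city": "Lyon", "zip": "69001"}},
        },
    },
}


def get_factor(facts: Dict[str, Any], attribute: str, facet: str, factor: str) -> Any:
    """Walk fact -> facet -> factor (hypothetical helper)."""
    return facts[attribute]["facets"][facet]["factors"][factor]


work_city = get_factor(person_facts, "ADDRESS", "work", "city")
home_zip = get_factor(person_facts, "ADDRESS", "home", "zip")
```

Both facets carry the same factor keys, which is what makes the "work" and "home" addresses comparable.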
Every aspect of an Orchestra model has a unique identifier:
For every aspect, that unique identifier may be NULL (NULL attribute for a fact, empty string for keys and names). In that case, there may be several instances of that aspect, and they are accessed by index.
Which index parameter addresses which aspect is not consistent, and is explained further down in this page.
In general, it is preferred to identify aspects by index only if the ordering conveys some semantic value (like the order in which events happened). In the majority of cases, give the unique identifier a real value: in practice, having a unique key makes everything easier.
The Attribute aspect is new in Avial v5 and, to this date (2024-08-21), is not yet supported by all our Avial libraries. It is intended to keep track of things about the knowledge itself, for example the provenance of that knowledge, the date it was last updated, who updated what, when it should be considered stale, etc.
Each Attribute can have a single value and any number of traits.
Traits of Attributes are exactly like Features of Facts: they are a collection of name/key/value triples where the key must be unique if not null.
This section is a work in progress, and is an unstructured list of lessons learned and things to consider when modeling.
First, and maybe most importantly, do not START by asking how to model the thing with the Avial Model. First, model the thing with whatever tool or language you want. Think about what things you care about, how they relate to each other, how you are going to use that knowledge, what questions you will ask of that knowledge, etc. Once you have a description of what you want modeled, you can figure out a way to express it with the Avial Model constructs. The important point is that the Avial Model can be very confusing, and you should not get lost in the details of how to model something before having the big picture of what you want to model and why.
Sometimes, it's hard to know whether a certain concept should be captured as a separate entity or as part of an existing entity. E.g. should the address of a Person be an entity, or should it just be a fact in the Person?
Here are a few questions to help you decide:
Every answer will fall somewhere between yes and no, and the right choice ultimately depends on the goal you want to achieve. However, the more of these questions you answer with yes, the more likely it is that representing the concept as a separate entity is the better choice.
Links should generally be bidirectional. That means that if an entity A references an entity B, entity B should reference entity A. This ensures we can find everything that references a given entity, which is needed in some cases: for example, before deleting an entity, we need to ensure nothing references it.
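A minimal sketch of the bidirectional-link idea, using a toy in-memory class rather than real Avial entities. `ToyEntity`, `link`, `unlink` and `can_delete` are hypothetical names; in a real model the back-reference would be stored in whatever property or fact the modeling has chosen.

```python
from typing import Set


class ToyEntity:
    """Toy entity that records both directions of every link."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.links_to: Set[str] = set()      # entities this one references
        self.linked_from: Set[str] = set()   # entities that reference this one


def link(source: ToyEntity, target: ToyEntity) -> None:
    # Always write both sides so neither entity has to be searched for.
    source.links_to.add(target.name)
    target.linked_from.add(source.name)


def unlink(source: ToyEntity, target: ToyEntity) -> None:
    source.links_to.discard(target.name)
    target.linked_from.discard(source.name)


def can_delete(entity: ToyEntity) -> bool:
    # Safe to delete only when nothing references the entity any more.
    return not entity.linked_from


person = ToyEntity("person")
address = ToyEntity("address")
link(person, address)
```

Because the back-reference lives on the target, checking whether `address` can be deleted does not require scanning every other entity.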
When referencing an entity, the reference holds three kinds of information:
There are five standard adapters that provide the ability to store an Orchestra model. These adapters are implemented using precisely the same software. However, to aid performance and computational load balancing, each adapter is configured differently, placing restrictions on which model aspects are supported. Among an adapter’s configuration parameters are the ability to enable or disable the main property and fact model aspect groupings, and whether or not the adapter supports data storage (i.e. file read/write). While the registries and objects adapters are restricted to just property-related aspects and fact-related aspects, the general adapter is configured with no restrictions.
Adapter | Outlet | Property | Fact | Data |
---|---|---|---|---|
Registries | <10> | X | | |
Objects | <11> | X | | |
Folders | <12> | X | X | |
Files | <13> | X | X | |
General | <14> | X | X | X |
To create an entity with an Orchestra model, pick which adapter you want to use and issue an `Invoke: Create` on it.
Standard Orchestra libraries usually have the functions `Create_Registry`, `Create_Object`, `Create_Folder`, `Create_File` and `Create_General` for ease of use.
Note that it is possible to first create an entity, and then connect it manually to one of these 5 adapters yourself, but this is not good practice as it can create issues when using the automatic reference counting to automatically garbage collect unreferenced entities.
Even though it is possible to manually delete entities by calling `Delete_Entity` (or `Delete_Registry`, `Delete_Object`, etc.; they all do the same), it is not the advised way to do it.
The advised way is to rely on the automatic reference counting provided by the Registries/Objects/Folders/General adapters. The way it works is that whenever an entity is referenced in any value field of the Orchestra model (any [value](./Orchestra Value) of Tag Entity, or any value inside a List, Aggregate or Variable), the reference count (a field in the metadata) is increased by one. Whenever a reference is removed, the reference count is decreased by one.
Periodically, the server will automatically garbage-collect entities with a `0` reference count and call `Delete_Entity` on them.
Therefore the only thing you need to do to properly manage the lifetime of your entities is to ensure they are referenced by something. A common practice is to have one known 'root' registry with a reference to itself (so it keeps itself alive), then create a hierarchy of registries from there (similar to folders on a file system), and register all your entities somewhere in this hierarchy. Usually, we have one registry per 'type' of entity (entities all representing the same kind of thing in the world the knowledge space represents), which also gives us the ability to iterate over all entities of the same 'type'. This is useful for a variety of tasks such as debugging, or model migration.
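The lifetime rule can be illustrated with a toy simulation. Everything here is hypothetical (`ToySpace` and its methods are not Avial functions, and the real server keeps the count in entity metadata rather than recomputing it); the sketch only shows why a self-referencing root keeps a registry hierarchy alive while unregistered entities disappear.

```python
from typing import Dict, List


class ToySpace:
    """Toy knowledge space: each entity lists the entities its values reference."""

    def __init__(self) -> None:
        self.refs: Dict[str, List[str]] = {}

    def create(self, name: str) -> None:
        self.refs[name] = []

    def add_reference(self, holder: str, target: str) -> None:
        self.refs[holder].append(target)

    def remove_reference(self, holder: str, target: str) -> None:
        self.refs[holder].remove(target)

    def reference_count(self, name: str) -> int:
        return sum(targets.count(name) for targets in self.refs.values())

    def collect(self) -> List[str]:
        """One periodic pass: delete every entity whose count is currently 0."""
        dead = [name for name in self.refs if self.reference_count(name) == 0]
        for name in dead:
            del self.refs[name]  # its outgoing references vanish with it
        return dead


space = ToySpace()
for name in ("root", "people", "alice", "orphan"):
    space.create(name)
space.add_reference("root", "root")      # the root registry keeps itself alive
space.add_reference("root", "people")    # one registry per 'type' of entity
space.add_reference("people", "alice")

first_pass = space.collect()             # only the unregistered entity goes away
```

Removing the `root` to `people` reference afterwards would let `people` be collected on the next pass and `alice` on the pass after that, since each deletion drops the references the deleted entity was holding.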
The model can be modified through various standard Orchestra utility functions, which are all wrappers around INVOKE calls to the object adapter.
When `0` is given but multiple values exist, it targets the last value by default.
BE CAREFUL: indexing is 1-based! That means the first element is at index 1. Index 0 represents "no index".
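The indexing convention can be summarized with a small helper. This is a sketch in plain Python, not an Avial function: `pick` is a hypothetical name, and how the server reacts to an out-of-range index is assumed here to be an error.

```python
from typing import List, TypeVar

T = TypeVar("T")


def pick(values: List[T], index: int = 0) -> T:
    """1-based lookup; index 0 means 'no index' and targets the last value."""
    if not values:
        raise IndexError("no values")
    if index == 0:
        return values[-1]
    if not 1 <= index <= len(values):
        raise IndexError(f"index {index} out of range 1..{len(values)}")
    return values[index - 1]  # translate to Python's 0-based indexing


events = ["created", "updated", "archived"]
first_event = pick(events, 1)
default_event = pick(events)
```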
Each aspect uses different names for its index parameters, and the naming is inconsistent:
For a Property:

- `index` is the index of a property.

For an annotation of a property:

- `instance` is the index of a property.
- `index` is the index of an attribute.

Note: It appears that the avu has a bug where it only displays the last value of the annotation Attribute = NULL

For a Fact:

- `index` is the index of a Fact.

For a Facet:

- `instance` is the index of a Fact.
- `index` is the index of a Facet.

For a Factor:

- `offset` is the index of a Fact.
- `instance` is the index of a Facet.
- `index` is the index of a Factor.

For a Feature:

- `instance` is the index of a Fact.
- `index` is the index of a Feature.

For a Field:

- `instance` is the index of a Fact.
- `index` is the index of a Field.

For a Frame:

- `instance` is supposedly for the Fact, according to error message, however a frame cannot be added on a NULL Attribute (if you figure out how to create a frame on a NULL attribute, send a message to @yanis-fourel, he gets an error no matter what he tries).
- `offset` is the index of a Field.
- `index` is the index of a Frame.

Function | Description |
---|---|
Insert | Inserts a [value](./Orchestra Value) at the given index or key, pushing any element after it one index further. Error if given key already exists |
Remove | Removes given index. Error if the index does not exist |
Replace | Replaces given index or key. Error if the index does not exist |
Find | Starting from the optionally given index, get the next index whose [value](./Orchestra Value) is equal to the one searched for. Returns 0 if there is none. For Features and Properties, Find searches for either a value or a name; if both are given, it searches for the value first, and searches for the name only if the value isn't found. For Properties, if the name is NULL, Find just returns 0. That's probably a bug: to search for a value, pass in any name ("foo" is good enough), otherwise it will just return 0. Raises an error if a parent of the element requested does not exist |
Include | Create/Update a [value](./Orchestra Value) by key. For the facets, the name is considered to be the key. Error if a parent of the element requested does not exist. Eg: try to include a facet in a fact that doesn't exist, or a factor in a facet that doesn't exist. You can use the Set operation instead to automatically create the parents if they don't exist. Error if given key is null |
Exclude | Remove a [value](./Orchestra Value) by key. For the facets, the name is considered to be the key. Error if a parent of the element requested does not exist. Eg: try to exclude a facet from a fact that doesn't exist, or a factor from a facet that doesn't exist. You can use the Clear operation instead, which has no effect in that case rather than raising an error |
Set | Same as Include, except that it doesn't fail if a parent of the element requested does not exist, it instead creates the hierarchy before setting the thing we are setting. |
Get | Get a [value](./Orchestra Value) by key. Return a [value](./Orchestra Value) of tag NULL if the key does not exist, or if a parent of the element requested does not exist |
Clear | Same as Exclude, except that it doesn't raise an error if we clear something in a fact/facet that doesn't exist. It just doesn't have any effect. |
Count | Get the number of properties/facts/facets/factors/features/frames/fields |
Member | True if the key exists, false if it doesn't. However keep in mind that by the time the response comes back to you, somebody else might be modifying the model and the response might already be outdated |
Name | Get the name of the thing. For facts, name is always empty string. |
Key | Get the key of the thing. For Facts and Facets, key is always empty string. |
Value | Same as GET, but raises an error if a parent of the element requested does not exist. |
Index | Get index from key. Raises an error if a parent of the element requested does not exist |
Attribute | Get the attribute from the index of its fact. You probably only ever need to call that at a fact level, no need to use the Attribute_Facet, factor, feature, or whatever, it's all gonna be the same. For properties, the attribute is always NULL_ATTRIBUTE |
Purge | Removes all elements from the targeted list |
Sort | Sort keys alphabetically. That does not seem to work if the list has both keyed and unkeyed elements |
Retrieve | I guess it's supposed to give you a JSON of the content of the collection you want, but I can't get it to work |
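
The difference between Include, Set, Get, Exclude and Clear when a parent is missing is easy to mix up, so here is a toy model of those rules for facets. It is a sketch over plain dictionaries, not the Avial functions themselves: the `*_facet` names are hypothetical, a NULL value is modeled as `None`, and removing a key that is absent from an existing fact is assumed to be silently ignored since the table does not specify it.

```python
from typing import Any, Dict, Optional

Facts = Dict[str, Dict[str, Any]]  # attribute -> {facet name -> value}


def include_facet(facts: Facts, attribute: str, name: str, value: Any) -> None:
    if not name:
        raise ValueError("key is null")
    if attribute not in facts:
        raise KeyError("parent fact does not exist")
    facts[attribute][name] = value  # create or update


def set_facet(facts: Facts, attribute: str, name: str, value: Any) -> None:
    # Like Include, but creates the missing parent fact first.
    facts.setdefault(attribute, {})[name] = value


def get_facet(facts: Facts, attribute: str, name: str) -> Optional[Any]:
    # Missing key or missing parent both give a NULL value.
    return facts.get(attribute, {}).get(name)


def exclude_facet(facts: Facts, attribute: str, name: str) -> None:
    if attribute not in facts:
        raise KeyError("parent fact does not exist")
    facts[attribute].pop(name, None)


def clear_facet(facts: Facts, attribute: str, name: str) -> None:
    # Like Exclude, but a missing parent is simply a no-op.
    facts.get(attribute, {}).pop(name, None)


facts: Facts = {}
set_facet(facts, "ADDRESS", "home", "1 Example Street")
include_facet(facts, "ADDRESS", "work", "2 Example Avenue")
clear_facet(facts, "PHONE", "home")  # no PHONE fact: nothing happens
```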
If you want to add a new entry to a registry without the risk of a race condition, you can use the fact that `insert_property` will fail if a property with the same key already exists. This way, only the very first caller of `insert_property` will succeed, making the operation atomic.
Here is an example implementation in Python:

```python
from typing import Tuple

# AvEntity, AvValue, ApplicationError, Config, lookup_registry, create_object
# and insert_property are assumed to be imported from the Avial Python library
# and project configuration in use.


def get_or_create_object(registry: AvEntity, key: str, name: str | None = None) -> Tuple[bool, AvEntity]:
    """
    returns (True, entity) if entity was created by this thread
    returns (False, entity) if entity already existed or was just created by another thread
    """
    try:
        return (False, lookup_registry(registry, key, authorization=Config.auth))
    except Exception:
        # Lookup failed, most likely because the key is not registered yet:
        # fall through and try to create the entry.
        pass

    new_object_entity = create_object(authorization=Config.auth)
    try:
        insert_property(
            entity=registry,
            name=name if name is not None else key,
            key=key,
            value=AvValue.encode_entity(new_object_entity),
            authorization=Config.auth,
        )
        return (True, new_object_entity)
    except ApplicationError as e:
        # In case of race condition, more than one thread can get past the
        # first try block. In that case, the second thread will fail to insert
        # the entity.
        # Calling `lookup_registry` again will ensure that we return the same
        # entity in all the threads. The object created above by the losing
        # thread stays unreferenced, so it is left to the garbage collection.
        if 'property already exists' in e.message:
            return (False, lookup_registry(registry, key, authorization=Config.auth))
        raise
```
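
To see why the pattern holds under contention, here is a self-contained simulation that runs without an Avial server. `ToyRegistry` is a hypothetical in-memory stand-in whose `insert` fails when the key already exists, which is the only property the technique relies on; the lock stands in for the server applying one insert at a time.

```python
import threading
from typing import Dict, List, Tuple


class ToyRegistry:
    """In-memory stand-in for a registry entity (not the Avial API)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[str, str] = {}

    def lookup(self, key: str) -> str:
        return self._entries[key]  # raises KeyError when the key is absent

    def insert(self, key: str, value: str) -> None:
        with self._lock:
            if key in self._entries:
                raise KeyError("property already exists")
            self._entries[key] = value


def get_or_create(registry: ToyRegistry, key: str, candidate: str) -> Tuple[bool, str]:
    try:
        return (False, registry.lookup(key))
    except KeyError:
        pass
    try:
        registry.insert(key, candidate)
        return (True, candidate)
    except KeyError:
        # Another thread won the race: return what it registered.
        return (False, registry.lookup(key))


registry = ToyRegistry()
results: List[Tuple[bool, str]] = []
barrier = threading.Barrier(8)


def worker(i: int) -> None:
    barrier.wait()  # release all threads at once to provoke the race
    results.append(get_or_create(registry, "alice", f"object-{i}"))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread wins, exactly one call reports that it created the entry, and every call returns the same registered value.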