List current issues with api design from JS client

List some edge cases in which the api falls short when using with https://forge.extranet.logilab.fr/cubicweb/cubicwebjs/. @fbessou @famarger

Sections:

Relations with multiple objects/subjects
Issues with the DataProvider interface
Issues with the current DataProvider implementation
Issues with the python API

Issues with the Schema abstraction classes

Relations with multiple objects or subjects

Issue

The current schema has one RelationDefinitionSchema instance for each subject-relation-object triplet. This is fine when working with that specific triplet, but becomes complex when trying to work on the relation type itself without specific subject/object.

An example is when trying to create a field to edit the relation type allowing the user to input any object type. Developers have to handle several instances with exactly the same values but the object type. It reduces DX and may mislead the developer into thinking other values such as the cardinality may change.

Proposal

Create a higher level class for the relation type independently of any subject/object combination, similar to what is done in Python.

Schema entity and relation getters create new classes

Issue

When using Schema.getEntity, Schema.entities, Schema.relationsDefinitions, Schema.matchRelationDefinitions and equivalent in EntitySchema and RelationDefinitionSchema, a new object instance is created each time. While not creating object instances for all items in the schema help performance on large schema, it complicates writing react hooks using such values.

As the instances are created on each call, if you call such method inside a react function the returned value will be considered different in hook dependency list. We must manually wrap the value in useMemo to prevent hooks from running on each render.

See https://forge.extranet.logilab.fr/cubicweb/cubicwebjs/-/blob/branch/default/packages/react-form-utils/src/hooks/useGetCubicWebEntity.ts?ref_type=heads#L85 for an example.

Proposal

None so far :/

Issues with the DataProvider interface

The DataProvider interface was designed by the react-admin team to fulfill needs specific to react-admin.

Issues

Naming

There are two major naming differences between CubicWeb domain and React-Admin domain:

CubicWeb Entities are mapped to React-Admin Resources
in CubicWeb, entities unique ids are conventionnaly named eid while React-Admin talks about id.

Typing

The DataProvider type is generic enough to support all type of data. But this genericity make usage of the API error prone:

Many methods parameters and return values contain values typed as any (e.g. the filter param)
identifiers are typed as string | number where we would want to only allow number (see Identifier type)
The interface isn't made to allow dependent type (i.e. the type of methods return value can't be dependent of the value given as parameter)
- note that the DataProvider methods are generic over the returned data type. But this type must be specified at call site:
```
    const users = getList<CWUser>("CWUser")
```
  nothing prevents from forgetting to precise a type (in which case the data is considered to be of type RaRecord). An incorrect type could also be given without the compiler noticing any mistake:
```
   const users = getList<{id: string, lgin/*instead of login*/: string}>("CWUser")
```

Features

Sorting can be made only on a single field => Can't sort on both lastname and firstname without introducing convention on the sort parameter:
```
  sort: {
      field: 'title,sequence', order: 'ASC,ASC'
  },
```
The way to tell which attributes and relations must be included in query isn't specified by the DataProvider interface. To allow features such as this one, the DataProvider methods accept a (meta)[https://github.com/marmelab/react-admin/blob/b5373de077469617d524e4e533eeea68cda98131/packages/ra-core/src/types.ts#L143] option. This option type is declared as any to support several implementations.
To update an entity we have to call an updateOne method with both the previous and newer version of the entity. There are no methods to add or remove a relation. It might seem reasonable as it reduce the API surface but it can cause performance problems when managing entities with lot of relations. In this case we have to load all relations.

Proposal

Move away from the data provider interface and only use it when dealing with react-admin. See the roadmap in #750 (comment 137834).

Issues with the current DataProvider implementation

Get after create/update

Issue

When updating/creating an entity, the data provider will fetch the updated entity and return the result. But the user has no control over this fetch. This causes several issues: The user may use parameters when using getOne which will not be reflected here. If they then use the return value of the update/create method as truth for their application, it can break their state as it may have a different set of relations.

Also in cases where the page is refreshed/changed after the update/create, then this additional fetch is useless and impacts performance as the values will never be used.

Proposal

Do not fetch the entity after an update/create. If it is absolutely necessary to fetch the entity after the update, then the user can manually do it from their app.

Model for relations

Issue

In CubicWeb relations are directional associations between a subject and an object.

With the current implementation of the DataProvider interface, relations fetched as part of entities are stored in properties named TypeOfSubject__relation_type__TypeOfObject. For example if we want to access the email address of a CWUser we have to type myUser.CWUser__email_address__EmailAddress[0].address (in the base CubicWeb base schema, email address are entities). This naming was adopted to allow differentiating relations between different targets. For example, we could have a schema containing a relation is_in with the following relation definitions: Museum -- is_in --> City and City -- is_in --> Country.

Proposal

Another approach that would be more readable would be to always store the type of an entity in its corresponding data structure:

museums = [{
    eid: 1337,
    name: "Potato Museum"
    is_in: [{eid: 123, type: "City"}]
}]

// or

cities = [{
    eid: 123,
    name: "Potato City",
    reverse_is_in: [{eid: 1337, type: "Museum"}]
}]

How can we handle cases where we did not fetch relation values first (eg: there are too many)?

idea: Store fetched values (if any) in a current key, and store added/removed values in add and remove keys.

museums = [{
    eid: 1337,
    name: "Potato Museum"
    is_in: {
        current: [{eid: 123, type: "City"}]
        add: [{eid: 5, type: "City"}]
        remove: [{eid: 123, type: "City"}]
    }
}]

Type inference

In some cases when the yams schema is too large, typing inference breaks. It severely impacts the DX and increases compilation times.

RQL generation

Issues

The data provider is responsible for generating RQL queries for simple tasks such as getting/updating/creating one or several entities. This works but has several drawbacks:

Very hard to maintain. There are so many possibilities when generating RQL that it is very complex to check all cases.
One more layer on top of the client/schema libs. The data provider was created in the first place to interface with react-admin. It uses a common interface for any backend. This should not be necessary when not using react-admin and the api should provide similar features out of the box. Ideally the data provider should only exist to translate the api interface into the react-admin interface, not to do the heavy lifting.

Proposal with transaction helpers

Create as many functions as necessary to manipulate entities, these functions operate on Transactions and return references usable in next queries or to retrieve results. The references can point to result sets, rows or values.

Example:

const transaction = new Transaction()
const transactionHelper = new TransactionHelper(schema, transaction)

// On crée une personne appelée "John"
const {eidRef: johnEidRef} = transactionHelper.create(/*transaction,*/{etype: "Person", name: "John", age: 12})

// On récupère les données de l'entité
const {resolveEntity: resolveJohn, attributesRefs: {name: nameRef}} = transactionHelper.getEntity(/*transaction,*/ {eid: johnEidRef})
// On récupère les données de toutes les personnes appelées "John"
const {resolveEntities, resolveCount} = transactionHelper.getEntities(/*transaction,*/ {etype: "Person", filter: {name: nameRef}})
// On récupère les personnes qui connaissent John
const {resolveEntities, resolveCount} = transactionHelper.getEntities(/*transaction,*/ {etype: "Person", filter: {knows: johnEidRef}})
// On exécute la transaction
const result = client.executeTransaction(transaction)

const john = resolveJohn(result)

console.log(john)


class TransactionHelper {

  constructor(private schema: Schema, transaction: Transaction) {
  }
  
  create(/*transaction,*/ {etype, data}) {
    
  }

  ...
}

Proposal using restful API - DISMISSED

Update the api cube to provide new endpoints for classic use cases found in the data provider. These use cases are a lot easier to implement and maintain from the backend. These new endpoints should provide the following features:

get one entity. Users should be able to select which relations to fetch. The returned data should be in JSON format.
update one entity. Same as get, but for updating an entity.
create one entity. Same as update, but for creating an entity.
get many entities. Same as get, but can pass multiple eid to retrieve multiple entities.
get a list of entities. Like get many, but you only provide the type of the entity and not the eid. The result should be paginated and you should be able to sort using multiple criteria

The js client should then implement those new endpoints, removing the need for the data-provider in non-react-admin apps. it would also be a lot easier to support the same features in different client languages.

Issues with the python API

Possible relation targets

Issue

When using the JS libraries with the python API, it is not currently possible to reliably get the list of possible entities for a relation (subject or object) while respecting all RQL constraints and cardinality when creating.

We can use the RQL entity method linkable_entities, but it currently has several issues:

It does not work for object relations.
When creating new entities (eid unknown), we can't pass known values so the method can properly compute constraints.
As it is an entity method, it is executed as many times as there are entities (which can be 0 when we are creating the first one).
There is no pagination
The returned value is not a standard RQL result row
The entity method must validate its parameters manually
Does not return entities already related to the current one if cardinality is 1 or ?. You must then make an additional request to fetch information about the entities already related.

Proposal

No matter if we keep an entity method or choose another solution, we can get inspiration on the choices() method in cubicweb_web.

We can create a specific endpoint for that (see #831).

Pros

Can use openapi to validate parameters
Not limited to the RQL rows format as return value
Is only called once no matter the number of entities existing

Cons

Create a new endpoint

If we fix the linkable_entity method using logic in the choices() method, we can move it to the entity type instead of the instance.

Pros

Is only called once no matter the number of entities existing
Does not create a new endpoint

Cons

Must validate manually parameters
Must return a format different than RQL row which breaks client typing

Permissions

Issue

The python api currently does not expose permissions. If an RQL request asks for a relation the user has no permission for, the whole request is rejected. As there is no client knowledge of permissions, clients cannot build specific UIs for different user roles and select which relations to fetch.

Proposal

Build a new route to let clients fetch permissions and the current user role.

Validation/error handling

Issue

Validating RQL constraints locally is not optimal as it needs one request for each relation to check. It is easier and more performant to send the data to the backend and wait for an answer. But validation errors use a non standard format which clients must parse in order to display errors to the client.

Proposal

Convert ValidationError messages to JSON to make it easier for clients to get the information

Edited Jan 17, 2024 by Arnaud Vergnet

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information