How to get a list of a channel's imported nodes

Summary

kolibri manage listchannels lists imported channels.
I would like a utility that lists the nodes imported for a given channel.
The goal is to replicate an installation across time and space by providing a list of channels and nodes to another user.
The focus is on public channels.

Technical details

I would like the reverse of kolibri manage importcontent --node_ids <ids> network <channel id>

That is, for a given channel id it returns a list of completely imported nodes such that if that list were supplied as an argument in the above command the same content would be imported.

So the first question is whether such a function already exists in the code base.

The second set of questions relates to trying to write such a function.

Looking through the database, I do not see a column that seems to indicate that a particular content file has been imported. I thought perhaps available in content_localfile, but it always seems to be 0. Is it necessary to check the file system to see if a content item is there (and possibly whether it has the correct file size)?

The MPTT schema is a little tough to understand, but would I be right in thinking that if I select all topic nodes with rght == lft + 1, these are leaves and could be searched for downloaded content? There is probably a function somewhere that would get me this list.
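To illustrate the leaf test, here is a minimal sketch on a toy nested-set table. The table and column names mirror content_contentnode, but the rows are made up and the helper name leaf_titles is hypothetical. Note that in a nested set a leaf is any node with rght == lft + 1 whatever its kind, so leaves will normally be resources rather than topics (unless a topic is empty).

```python
import sqlite3


def leaf_titles(conn):
    """Return titles of nodes that have no children in a nested-set (MPTT) table."""
    rows = conn.execute(
        "SELECT title FROM content_contentnode WHERE rght = lft + 1 ORDER BY lft"
    )
    return [title for (title,) in rows]


# Toy tree: root > topic > two resources. Made-up rows, not real channel data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content_contentnode (title TEXT, kind TEXT, lft INTEGER, rght INTEGER)"
)
conn.executemany(
    "INSERT INTO content_contentnode VALUES (?, ?, ?, ?)",
    [
        ("Channel root", "topic", 1, 8),
        ("Hand Washing", "topic", 2, 7),
        ("Hand Washing 20 Seconds", "video", 3, 4),
        ("Hand Washing (PDF)", "document", 5, 6),
    ],
)
print(leaf_titles(conn))
```

Against a real installation the same WHERE clause would also need a channel_id filter, since the table holds every imported channel.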

Any help or advice appreciated.

Hi @Tim,

I don’t believe there’s an existing command for this functionality. In case it’s helpful and to provide additional background, there are two aspects of content import: the import of the metadata and the import of the actual content files. The metadata is stored in the database, like you found with content_localfile. If the content files have been imported, the available field should be truthy, so I’m unsure why you’ve observed that it is always 0. You’re using Kolibri 0.15?

In the Kolibri shell (kolibri shell), the following code will produce the necessary node IDs you’re looking for:

from kolibri.core.content.models import ContentNode

node_ids = ContentNode.objects.filter(available=True, channel_id="<CHANNEL_ID_HERE>").get_ancestors(include_self=True).values_list("id", flat=True)

At the ContentNode level, the available field is truthy if all necessary non-supplementary files are imported.

Also, we have a new feature coming with the release of Kolibri 0.16, currently in development, that might interest you. The optional feature will enable automatic content import of individual resources assigned in classes, like lessons and quizzes, upon successful sync of a learning facility. The feature will be enabled by default for learn-only devices.

Regards,
Blaine

I don’t think the available field is truthy. Here is the partial import of a channel from a local URL:
http://192.168.3.162/kolibri/en/device/#/content/channels/12cee68c112452a1be3f73e730ec2114?node_id=d95cbfd3f1b9555e9713cd7ef3603cfc

sqlite> select kind, level, title, tree_id, cn.available, cf.available, lf.available from content_contentnode cn
inner join content_file cf on cn.id = cf.contentnode_id
inner join content_localfile lf on lf.id = cf.local_file_id;

kind      level  title                                           tree_id  available  available  available
--------  -----  ----------------------------------------------  -------  ---------  ---------  ---------
topic     1      English                                         1869672  1          1          0
document  1      Afrikaans                                       1869672  1          1          0
document  1      Afrikaans                                       1869672  1          1          0
document  1      العربية                                         1869672  1          1          0
document  1      العربية                                         1869672  1          1          0
document  1      Français, langue française                      1869672  1          1          0
document  1      Français, langue française                      1869672  1          1          0
document  1      हिन्दी, हिंदी                                   1869672  1          1          0
document  1      हिन्दी, हिंदी                                   1869672  1          1          0
document  1      isiXhosa                                        1869672  1          1          0
document  1      isiXhosa                                        1869672  1          1          0
document  1      isiZulu                                         1869672  1          1          0
document  1      isiZulu                                         1869672  1          1          0
document  1      Kiswahili                                       1869672  1          1          0
document  1      Kiswahili                                       1869672  1          1          0
document  1      中国大陆                                            1869672  1          1          0
document  1      中国大陆                                            1869672  1          1          0
document  1      漢語 (繁體字)                                        1869672  1          1          0
document  1      漢語 (繁體字)                                        1869672  1          1          0
document  1      Português                                       1869672  1          1          0
document  1      Português                                       1869672  1          1          0
document  1      Setswana                                        1869672  1          1          0
document  1      Setswana                                        1869672  1          1          0
document  1      Español                                         1869672  1          1          0
document  1      Español                                         1869672  1          1          0
document  1      Tetun                                           1869672  1          1          0
document  1      Tetun                                           1869672  1          1          0
topic     2      Hand Washing                                    1869672  1          1          0
topic     2      Physical Distancing                             1869672  1          1          0
topic     2      Management of Respiratory Secretions            1869672  1          1          0
topic     2      Other Strategies to Prevent Spread at Home      1869672  1          1          0
topic     2      Stanford Medicine Animated Series               1869672  1          1          0
video     3      Hand Washing 20 Seconds                         1869672  1          1          0
video     3      Hand Washing 20 Seconds                         1869672  1          1          0
video     3      Hand Washing Carousel                           1869672  1          1          0
video     3      Hand Washing Carousel                           1869672  1          1          0
video     3      Correct Hand Washing                            1869672  1          1          0
video     3      Correct Hand Washing                            1869672  1          1          0
document  3      Hand Washing                                    1869672  1          1          0
document  3      Hand Washing                                    1869672  1          1          0
video     3      Why Physical Distancing Works                   1869672  1          1          0
video     3      Why Physical Distancing Works                   1869672  1          1          0
video     3      Why Physical Distancing Works (Full Animation)  1869672  1          1          0
video     3      Why Physical Distancing Works (Full Animation)  1869672  1          1          0
document  3      Physical Distancing                             1869672  1          1          0
document  3      Physical Distancing                             1869672  1          1          0
video     3      Management of Respiratory Secretions            1869672  1          1          0
video     3      Management of Respiratory Secretions            1869672  1          1          0
document  3      Management of Respiratory Secretions            1869672  1          1          0
document  3      Management of Respiratory Secretions            1869672  1          1          0
video     3      Stay Home if Unwell                             1869672  1          1          0
video     3      Stay Home if Unwell                             1869672  1          1          0
video     3      Sharing Food Safely                             1869672  1          1          0
video     3      Sharing Food Safely                             1869672  1          1          0
document  3      Other Strategies to Prevent Spread at Home      1869672  1          1          0
document  3      Other Strategies to Prevent Spread at Home      1869672  1          1          0
video     3      Global COVID-19 Prevention                      1869672  1          1          0
video     3      Global COVID-19 Prevention                      1869672  1          1          0
video     3      Staying Safe When COVID-19 Strikes              1869672  1          1          0
video     3      Staying Safe When COVID-19 Strikes              1869672  1          1          0

I would not expect available to be the same in all rows.

Hi @Tim,

The content_contentnode.available should be accurate for this purpose. The discrepancy between the screenshot and the SQL query is confusing though.

Again, what version of Kolibri are you running? The field content_file.available appears to have been removed in version 0.13.

-Blaine

Another question: what database file have you opened in sqlite? If that’s the channel database, then everything would be shown as available.

Device info

Server URL http://192.168.3.162:8009/
Free disk space 86 GB
Kolibri version 0.15.12
Device name iiab-kolibri-162 (5732)

It is a channel database I think:

/library/kolibri/content/databases/12cee68c112452a1be3f73e730ec2114.sqlite3

Where should I be looking?

I opened /library/kolibri/db.sqlite3

select title from content_contentnode where available = 1;

Stanford Digital MEdIC Coronavirus Toolkit
English
Hand Washing
Physical Distancing
Hand Washing 20 Seconds
Hand Washing Carousel
Correct Hand Washing
Hand Washing
Why Physical Distancing Works
Why Physical Distancing Works (Full Animation)
Physical Distancing

This looks better.

But it appears available is 1 if any of the content below a node has been imported.

I imported ES Khan Introducción a las fracciones and Qué significan las fracciones and got the following list with the above query:

Khan Academy (Español)
Matemáticas [incomplete parent]
Aritmética [incomplete parent]
Entiende fracciones [incomplete parent]
Introducción a las fracciones
Introducción a las fracciones
Cortar figuras en partes iguales
Corta figuras en partes iguales
Qué significan las fracciones
Identificar numeradores y denominadores
Comprende los numeradores y denominadores
Reconoce fracciones
Reconoce fracciones
Reconocer fracciones mayores que 1
Reconocer fracciones mayores que 1

Hi @Tim,

Yes, you should query db.sqlite3. The other database is where the channel metadata is imported from, so it would likely always indicate content is available.

There’s one other feature coming in Kolibri 0.16 that I think you will find useful here.

In Kolibri 0.16, by default, the kolibri manage exportcontent command will generate a file in the output directory named “manifest.json”. So if I run kolibri manage exportcontent e409b964366a59219c148f2aaa741f43 ./kolibri_export, Kolibri will create a file named ./kolibri_export/content/manifest.json. That file describes the specific content which has been exported to this particular directory. In my case, it looks like this:

{
    "channels": [
        {
            "id": "e409b964366a59219c148f2aaa741f43",
            "version": 10,
            "include_node_ids": [
                "6277aa0c44235435acdc8a9ed98f466b"
            ]
        }
    ],
    "channel_list_hash": "bb57509137cc800fa31444d81a7a5d17"
}

There are a few things we can do with this file using the importcontent command.

First, you can import content from the directory without specifying node IDs. The command detects if the content directory includes a manifest file and uses that, which can avoid some mishaps:

kolibri manage importcontent disk e409b964366a59219c148f2aaa741f43 ./kolibri_export

(You can disable this functionality and use the previous behaviour with kolibri manage importcontent disk ./kolibri_export --no_detect_manifest).

You can also import content from a network source using the manifest file. The manifest file doesn’t need to be in the same directory as the content, so if your devices all have network connections, you could copy around just that:

kolibri manage importcontent --manifest=./kolibri_export/content/manifest.json network e409b964366a59219c148f2aaa741f43

Note that the importcontent command still requires that you specify a channel to import. And at the moment, there isn’t a way to generate a manifest file without exporting content at the same time. I have been working on some functionality which connects these pieces a bit more, so it would be interesting to know if it is useful.
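For installations that do not yet have the --manifest option, a manifest like the one above could in principle be turned back into the importcontent --node_ids form from the original question. The following is only a sketch under assumptions: the function name importcontent_commands is hypothetical, the manifest layout is taken from the single example shown above, and it assumes --node_ids accepts a comma-separated list.

```python
import json


def importcontent_commands(manifest):
    """Build one importcontent command line per channel listed in a manifest dict."""
    commands = []
    for channel in manifest["channels"]:
        # Assumption: --node_ids takes a comma-separated list of node IDs.
        node_ids = ",".join(channel["include_node_ids"])
        commands.append(
            "kolibri manage importcontent --node_ids {} network {}".format(
                node_ids, channel["id"]
            )
        )
    return commands


# The example manifest from this thread, inlined so the sketch is self-contained.
manifest = json.loads("""
{
    "channels": [
        {
            "id": "e409b964366a59219c148f2aaa741f43",
            "version": 10,
            "include_node_ids": ["6277aa0c44235435acdc8a9ed98f466b"]
        }
    ],
    "channel_list_hash": "bb57509137cc800fa31444d81a7a5d17"
}
""")

for command in importcontent_commands(manifest):
    print(command)
```

The sketch only prints the commands; running them is left to the installer, and the version field is ignored here.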

If you’re wondering how that is implemented, these two pull requests have most of the story:

That 0.16 functionality sounds exactly like what I want, except could we have

kolibri manage exportcontent --manifest_only

Internet in a Box has a number of Presets to help installers set up common scenarios for audience and language. These Presets have JSON files, similar to manifests, that define the content and services that need to be installed. The manifest currently includes ZIMs, KA Lite topics, various HTML modules, and OSM tile sets. I’d like to add Kolibri channels and nodes.

Hopefully you have solved the problem I’m struggling with, which is that a node can be marked available when it is only partially imported, so you have to walk the tree to know which nodes are complete to then select the minimum such nodes.
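The tree walk described above could be sketched roughly as follows. This is an illustration only and not necessarily how Kolibri 0.16 does it: the function name, the children mapping and the imported set are hypothetical inputs that would have to be built from content_contentnode (for example from parent_id) plus whatever check decides that a resource is really on disk.

```python
def minimal_complete_nodes(children, imported, root):
    """Return the topmost node ids whose whole subtree is imported.

    children: dict mapping a node id to its list of child ids
              (a node with no entry, or an empty list, is treated as a resource)
    imported: set of resource ids whose files are present
    """
    result = []

    def visit(node):
        kids = children.get(node, [])
        if not kids:
            return node in imported
        # Visit every child without short-circuiting so that complete
        # siblings of an incomplete node are still collected.
        flags = [(kid, visit(kid)) for kid in kids]
        if all(ok for _, ok in flags):
            return True  # let the parent decide whether to report this node
        result.extend(kid for kid, ok in flags if ok)
        return False

    if visit(root):
        result.append(root)
    return result


# Toy tree: topic A is fully imported, topic B only partially.
children = {"root": ["A", "B"], "A": ["a1", "a2"], "B": ["b1", "b2"]}
imported = {"a1", "a2", "b1"}
print(sorted(minimal_complete_nodes(children, imported, "root")))
```

An empty topic is treated as a resource that was never imported, so it makes its parent incomplete; whether that is the desired behaviour is a design choice.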

Thanks, @dylan-m!

@Tim Could you clarify? If you’re referring to ancestor nodes of imported resources being marked as available, this is required for display in Kolibri. In 0.15, it isn’t possible to import nodes without the ancestors. In 0.16, some changes will only require the parent node for display within Kolibri.

The importcontent command will automatically deal with this scenario, so for your query you should filter on kind != 'topic' to select only the resources to pass to importcontent --node_ids.
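As a rough sketch of that query, the snippet below builds a toy content_contentnode table in memory and selects the available non-topic nodes for one channel. The helper name resource_node_ids and the short ids are made up for illustration; against a real installation you would open db.sqlite3 instead, where ids are long hexadecimal strings.

```python
import sqlite3


def resource_node_ids(conn, channel_id):
    """Return ids of available non-topic nodes for one channel."""
    rows = conn.execute(
        "SELECT id FROM content_contentnode "
        "WHERE channel_id = ? AND available = 1 AND kind != 'topic' "
        "ORDER BY id",
        (channel_id,),
    )
    return [node_id for (node_id,) in rows]


# Toy data standing in for db.sqlite3; ids are shortened, made-up values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content_contentnode "
    "(id TEXT, title TEXT, kind TEXT, available INTEGER, channel_id TEXT)"
)
conn.executemany(
    "INSERT INTO content_contentnode VALUES (?, ?, ?, ?, ?)",
    [
        ("n1", "Hand Washing", "topic", 1, "chan1"),
        ("n2", "Hand Washing 20 Seconds", "video", 1, "chan1"),
        ("n3", "Correct Hand Washing", "video", 0, "chan1"),
        ("n4", "Physical Distancing", "document", 1, "chan2"),
    ],
)

# Comma-joined here on the assumption that --node_ids takes a comma-separated list.
print(",".join(resource_node_ids(conn, "chan1")))
```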

@blaine I think I am agreeing with you, in that I assumed the ‘available’ field is mainly for display. I only observed that this makes calculating the list of fully imported topics more difficult, as ‘available’ cannot be used. @dylan-m looks to have done this, if I read the code correctly.