Each dataset is owned by an “organization”. The SSDC has over a dozen organizations listed. Each organization can have its own workflow and authorizations, allowing it to manage its own publishing process. A large organization can break up its data by department, allowing each department to be a separate organization within the SSDC.
Each organization can have a user assigned as an administrator. An organization’s administrator can add individual users to it, with different roles depending on the level of authorization needed.
A user in an organization can create a dataset owned by that organization. In the default setup, this dataset is initially private and visible only to other users in the same organization. When it is ready for publication, it can be published at the press of a button. Publishing may require a higher authorization level within the organization.
Only logged-in users who are members of the dataset’s organization can see private datasets. Private datasets also appear in search results, but only for users who have the appropriate permissions.
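The visibility rule above can be sketched as a small toy model in Python. This is not CKAN's own code: the Dataset and User classes, the can_see and visible_results helpers, and the sample organization names are all hypothetical, and the sketch assumes that membership in the owning organization is the only thing that grants access to a private dataset.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dataset:
    name: str
    owner_org: str
    private: bool = True  # default setup: new datasets start out private


@dataclass
class User:
    name: str
    organizations: set = field(default_factory=set)  # organizations the user belongs to


def can_see(dataset: Dataset, user: Optional[User]) -> bool:
    """Return True if the user may see the dataset (None = not logged in)."""
    if not dataset.private:
        return True  # published datasets are visible to everyone
    # Private datasets need a logged-in member of the owning organization.
    return user is not None and dataset.owner_org in user.organizations


def visible_results(datasets, user):
    """Filter a result list down to the dataset names the user may see."""
    return [d.name for d in datasets if can_see(d, user)]
```

With this model, a member of the owning organization sees both private and published datasets in a result list, while an anonymous visitor or a member of a different organization sees only the published ones.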
Datasets can be linked to a group as well as an organization.
There are three ‘types’ or ‘uses’ of the group feature:
Categories are one use of the group feature. They are predefined to align with the eight broad topics identified in the Information Management Plan. These eight categories are visible on the front page of the SSDC and will not change.
The remaining groups each represent a Collection of datasets that are relevant to an Organization (formal or functional) and that have specific data curation requirements.
In the case of the Skeena Watershed Initiative (SWI), the SWI group gathers all the reports produced by the initiative into one collection. Each report may have a different author and be tied to a different organization; the collection (SWI group) simply brings these reports together under one heading, creating a ‘mini library’ of the work carried out by the SWI. In this example, all the documents are public.
What is the difference between an organization and a group?
The difference is all about authorization. If you’re an editor or admin of an organization, you can create new datasets in that organization and edit the datasets that belong to it. If you’re an editor or admin of a group, all you can do is add existing datasets that are already on the site to the group, or remove datasets from the group. You can’t create new datasets or edit the datasets that belong to the group.
Also, organizations can contain private datasets that are visible only to the members of the organization. Groups can’t.
Organizations control which users can add, update and publish which datasets. Every dataset in CKAN must belong to exactly one organization. If a user is an editor or admin of an organization, then they can create new datasets in that organization, and can edit and publish the datasets that belong to that organization. For example, think of a national government open data site where each government department has its own organization and manages its own data and users.
Groups, on the other hand, are about curation: collecting datasets together into groups. If a user is an editor or admin of a group, then they can add datasets (that already exist on the site) to their group and can remove datasets from their group, but they cannot necessarily add new datasets to the site or edit the datasets that are in their group. Groups are meant to be used by the community of users of a site (the people consuming the data, not the people who are publishing the datasets on the site) to collect related datasets together into themes like “climate”.
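The organization/group distinction described above can be summarized as a minimal Python sketch. It is a toy model under stated assumptions, not CKAN's actual authorization code: the helper names, the role dictionaries, and the sample names "swi" and "dept-a" are hypothetical, and the only roles modelled are the "editor" and "admin" roles mentioned in the text.

```python
# Toy model of the organization/group distinction; not CKAN's real
# authorization logic. All names below are illustrative assumptions.
WRITE_ROLES = {"editor", "admin"}


def can_create_dataset(org_roles: dict, org: str) -> bool:
    """Creating a dataset needs an editor/admin role in the owning organization."""
    return org_roles.get(org) in WRITE_ROLES


def can_edit_dataset(org_roles: dict, owner_org: str) -> bool:
    """Editing depends on the dataset's organization, never on its groups."""
    return org_roles.get(owner_org) in WRITE_ROLES


def can_change_group_membership(group_roles: dict, group: str) -> bool:
    """A group editor/admin may only add or remove existing datasets."""
    return group_roles.get(group) in WRITE_ROLES


# Hypothetical curator: editor of the "swi" group, no organization role.
curator_org_roles = {}
curator_group_roles = {"swi": "editor"}

# Hypothetical publisher: editor of the "dept-a" organization, no group role.
publisher_org_roles = {"dept-a": "editor"}
publisher_group_roles = {}
```

In this sketch the curator can add or remove datasets in the "swi" group but cannot create or edit anything owned by "dept-a", while the publisher can create and edit "dept-a" datasets but cannot change what the "swi" group contains.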