Code¶
The code is organised into packages, in the standard Django way.
The following documentation is incomplete (work in progress); for the time being, it is better to refer to the actual sources.
Package: disco_service¶
This is a Django project, containing the usual settings.py, urls.py and wsgi.py.
Note
It also contains celery.py, which holds the configuration for the async worker nodes.
Package: crawler¶
This Django app is a simple wrapper.
The crawler app does not have an admin interface.
crawler.models¶
An ORM interface to the DB which is shared with the disco_crawler node.js app.
class crawler.models.WebDocument(*args, **kwargs)¶
Resource downloaded by the disco_crawler node.js app.
The document attribute is a copy of the resource which was downloaded.
url uniquely identifies the resource (there is no numeric primary key). host, path, port and protocol describe the HTTP request used to retrieve the resource. lastfetchdatetime and nextfetchdatetime are heuristically determined and drive the behavior of the crawler. _hash is indexed and has a corresponding attribute in the metadata.Resource class; the two values are compared to determine whether the metadata is dirty.
The rest of the attributes are derived from the content of the document.
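The dirty check described above can be illustrated with a minimal, framework-free sketch. It does not use the real models: WebDocumentSketch and ResourceSketch are hypothetical stand-ins, the name of the hash attribute on Resource is assumed to be _hash, and SHA-256 is an assumption, since the algorithm used by the crawler is not stated here.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def content_hash(document: str) -> str:
    """Hash a document body. SHA-256 is an assumption, not the crawler's known algorithm."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


@dataclass
class WebDocumentSketch:
    """Hypothetical stand-in for crawler.models.WebDocument (only the fields needed here)."""
    url: str
    document: str
    _hash: str


@dataclass
class ResourceSketch:
    """Hypothetical stand-in for metadata.models.Resource; the _hash field name is assumed."""
    url: str
    _hash: Optional[str] = None


def metadata_is_dirty(doc: WebDocumentSketch, resource: ResourceSketch) -> bool:
    """The metadata is stale when its stored hash differs from the crawled document's hash."""
    return resource._hash != doc._hash


body = "<html>hello</html>"
doc = WebDocumentSketch(url="http://example.org/", document=body, _hash=content_hash(body))

resource = ResourceSketch(url=doc.url)       # never processed, so no hash is stored yet
print(metadata_is_dirty(doc, resource))      # the hashes differ

resource._hash = doc._hash                   # simulate processing the resource
print(metadata_is_dirty(doc, resource))      # the hashes now match
```

Comparing a short indexed hash instead of the full document keeps the check cheap, which is presumably why _hash is indexed.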
Package: metadata¶
This Django app manages the content metadata.
metadata.models¶
class metadata.models.Resource(*args, **kwargs)¶
ORM class wrapping the persistent data of a web resource.
Contains hooks into the code for resource processing.
Package: govservices¶
This app wraps public data about government services.
govservices.models¶
class govservices.models.Agency(id, acronym)¶
exception Agency.DoesNotExist¶
exception Agency.MultipleObjectsReturned¶
Agency.dimension_set¶
Agency.objects = <django.db.models.manager.Manager object>¶
Agency.service_set¶
Agency.subservice_set¶
class govservices.models.SubService(id, cat_id, desc, name, info_url, primary_audience, agency)¶
exception SubService.DoesNotExist¶
exception SubService.MultipleObjectsReturned¶
SubService.agency¶
SubService.objects = <django.db.models.manager.Manager object>¶
class govservices.models.ServiceTag(id, label)¶
exception ServiceTag.DoesNotExist¶
exception ServiceTag.MultipleObjectsReturned¶
ServiceTag.objects = <django.db.models.manager.Manager object>¶
ServiceTag.service_set¶
class govservices.models.LifeEvent(id, label)¶
exception LifeEvent.DoesNotExist¶
exception LifeEvent.MultipleObjectsReturned¶
LifeEvent.objects = <django.db.models.manager.Manager object>¶
LifeEvent.service_set¶
class govservices.models.ServiceType(id, label)¶
exception ServiceType.DoesNotExist¶
exception ServiceType.MultipleObjectsReturned¶
ServiceType.objects = <django.db.models.manager.Manager object>¶
ServiceType.service_set¶
class govservices.models.Service(id, src_id, agency, old_src_id, json_filename, info_url, name, acronym, tagline, primary_audience, analytics_available, incidental, secondary, src_type, description, comment, current, org_acronym)¶
exception Service.DoesNotExist¶
exception Service.MultipleObjectsReturned¶
Service.agency¶
Service.life_events¶
Service.objects = <django.db.models.manager.Manager object>¶
Service.service_types¶
govservices.tests¶
Suite of tests ensuring that the code which manipulates govservices works correctly.
govservices.management.commands.update_servicecatalogue¶
It would be highly preferable to refactor this command to query the service catalogue through a REST API, rather than working directly with the ServiceJsonRepository.
class govservices.management.commands.update_servicecatalogue.Command(stdout=None, stderr=None, no_color=False)¶
manage.py extension. Call with:
python manage.py update_servicecatalogue
or:
python manage.py update_servicecatalogue <entity>
where <entity> is the name of one of the classes in metadata.models
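The two call forms can be modelled with an optional positional argument. Django management commands parse their arguments with argparse, so the sketch below uses argparse directly instead of the real Command class. It is an illustration only: build_parser, entities_to_update and the placeholder entity names are hypothetical, and the idea that omitting <entity> means "update everything" is an assumption about the command's behaviour.

```python
import argparse
from typing import List, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Parser with one optional positional argument, mirroring the two call forms."""
    parser = argparse.ArgumentParser(prog="manage.py update_servicecatalogue")
    # nargs="?" makes the positional optional, so the command parses with or without it.
    parser.add_argument("entity", nargs="?", default=None)
    return parser


def entities_to_update(argv: Sequence[str], known: Sequence[str]) -> List[str]:
    """Return every known entity, or only the one named on the command line."""
    entity: Optional[str] = build_parser().parse_args(list(argv)).entity
    if entity is None:
        # Assumption: no argument means all entities are updated.
        return list(known)
    if entity not in known:
        raise ValueError("unknown entity: %s" % entity)
    return [entity]


# Placeholder names, not the real model classes.
KNOWN = ["EntityA", "EntityB"]

print(entities_to_update([], KNOWN))            # no <entity> given
print(entities_to_update(["EntityB"], KNOWN))   # a single <entity> given
```

In a real command, the same add_argument call would normally live in the command's add_arguments method.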