Competition document: open-source big data insight
Updated 10 July 2015
1. Open-source big data insight
The Ministry of Defence’s (MOD) Information Systems and Services (ISS) vision is to provide information capabilities by developing closer alignment with industry partners. Through this Centre for Defence Enterprise (CDE) themed competition, ISS is looking for a technology demonstrator to provide an unstructured and open-source data analytics platform for defence.
The total funding available for phase 1 of this competition is £250,000. It’s anticipated that up to £2 million will be made available for phase 2.
This competition was briefed at the CDE Innovation Network event on 24 June 2015 in London. Presentation slides are available. It will also be briefed at a webinar on 17 July 2015.
Proposals must be submitted by 5pm on Thursday 13 August 2015. Submit your proposal using the CDE online portal.
2. Background
MOD ISS recognises that information professionals both within defence and industry will play a critical role in the development and delivery of its vision. ISS intends to build commercial models that support closer working with industry, reduce the red tape and challenges by using agile procurement and bring coherence to the wider defence information capability portfolio. In support of this, MOD’s Chief Digital and Information Officer (CDIO) is holding an information symposium event on 29 and 30 June 2015 and running this themed competition through CDE.
ISS wants this CDE themed competition to help bring together small and medium-sized enterprises (SMEs) and major industry players to develop and demonstrate their open-standard, open-architecture capabilities in the field of open-source data analytics and insight. The aim is to create an evolutionary new analytics platform for defence, in partnership with industry.
The private sector takes all available opportunities from new and improving technologies, in particular the internet, recognising that it’s transformed almost every aspect of modern life. Historically, MOD has been cautious about adopting emerging and unproven technologies. However, CDIO recognises that defence needs to adopt these freely available capabilities and services now.
This CDE themed competition seeks to engage with industry, and particularly SMEs that are delivering analytics and digital enablers, to deliver a capability that will become an important part of defence transformation.
3. Technology challenge
ISS is looking for a solution that provides an analytics platform for structured, unstructured and open-source data for defence. At a high level, this solution is intended to:
- ingest data from structured and unstructured sources
- read the content of structured and unstructured data, using Optical Character Recognition (OCR) where needed
- understand the text-based semantics and structure of text using Natural Language Processing (NLP)
- create data dependencies and relationships, based on configurable rules
- manage and grow a dictionary of terms, keywords and pseudonyms to identify data of interest from the ingested information, using Master Data Management (MDM) processes and techniques
- provide a rules engine for data dependencies, derivation, aggregation and raising events and actions for processing by operators
- provide a configurable and user preference-based visualisation of the data with drill-down to individual data items, and linked records
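As an illustration only, the first few steps in this list (ingest, tag against a dictionary, create relationships) could be sketched as follows. This is a minimal sketch rather than a required design: the `Document` class, the `DICTIONARY` contents, the file names and the `tag` and `link` functions are all hypothetical, and a real solution would use proper OCR and NLP components instead of simple pattern matching.

```python
from dataclasses import dataclass, field
from itertools import combinations
import re


@dataclass
class Document:
    """One ingested item; `source` and `text` are illustrative fields."""
    source: str
    text: str
    tags: set = field(default_factory=set)


# Hypothetical dictionary: canonical term -> aliases to look for in the text.
DICTIONARY = {
    "logistics": ["logistics", "supply chain"],
    "training": ["training", "exercise"],
}


def tag(doc: Document) -> Document:
    """Attach each canonical term whose alias appears as a whole word or phrase."""
    lowered = doc.text.lower()
    for term, aliases in DICTIONARY.items():
        if any(re.search(r"\b" + re.escape(a) + r"\b", lowered) for a in aliases):
            doc.tags.add(term)
    return doc


def link(docs):
    """Relate every pair of documents that share at least one tag."""
    return [
        (a.source, b.source, sorted(a.tags & b.tags))
        for a, b in combinations(docs, 2)
        if a.tags & b.tags
    ]


docs = [
    tag(Document("report-1.txt", "Supply chain delays affected the exercise.")),
    tag(Document("report-2.txt", "Logistics planning for winter training.")),
    tag(Document("report-3.txt", "Canteen opening hours have changed.")),
]
links = link(docs)
```

Here the first two documents are linked because both carry the `logistics` and `training` tags, while the third has no tags and stays unlinked. Keeping the dictionary as plain data is one way to let it grow without code changes.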
For phase 1 of this CDE competition, we’re looking for the architecture and components of the solution as a technology demonstrator of a potentially larger solution for defence.
ISS intends to take forward successful outputs of CDE-funded projects from this themed competition for phase-2 funding. See the exploitation section below for further information.
ISS is looking for phase-1 projects lasting for an 8-week period to develop/configure a demonstrator to be delivered by 30 November 2015.
The following diagram shows a high-level component structure for the solution. The red circles indicate the components intended to be developed and presented for the technology demonstrator during the phase-1 CDE themed competition.
The circled components for the technology demonstration are as follows:
- data and information sources that provide adapters or data-exchange facilities to ingest both structured and unstructured information
- data access and data management components to ensure that data ingested is audited, access is controlled, and actions/data updates are recorded and audited, including MDM techniques and processes
- a governance model that enables the configuration of data and access controls, and integration with third party tools (eg reporting, office automation tools and third party portals)
- a data repository that has multiple views from the original raw data, ie derived and aggregated datasets from this raw information
- visualisation capability over the data structures and content, including dashboards, reporting extracts, and visualisation of data structures, data content, dependencies and relationships
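To make the data management and repository components more concrete, the following is a rough sketch of a store that keeps raw records, audits every write and offers a derived, aggregated view. It is an assumption-laden illustration: `AuditedRepository`, its methods and the sample records are hypothetical, and it uses an in-memory dictionary where a real demonstrator would use a proper data store with access control.

```python
from collections import Counter
from datetime import datetime, timezone


class AuditedRepository:
    """In-memory stand-in for a data store that audits every write."""

    def __init__(self):
        self.raw = {}        # record id -> latest raw record
        self.audit_log = []  # append-only list of audit entries

    def put(self, user, record_id, record):
        """Store a record and log who created or updated it, and when."""
        action = "update" if record_id in self.raw else "create"
        self.raw[record_id] = dict(record)
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "record_id": record_id,
        })

    def count_by(self, field_name):
        """Derived view: number of raw records per value of one field."""
        return Counter(r.get(field_name) for r in self.raw.values())


repo = AuditedRepository()
repo.put("analyst-a", "r1", {"type": "pdf", "site": "north"})
repo.put("analyst-a", "r2", {"type": "log", "site": "north"})
repo.put("analyst-b", "r1", {"type": "pdf", "site": "south"})
by_site = repo.count_by("site")
```

The derived view is computed from the raw records on demand, so the raw data stays the single source and aggregated datasets can be rebuilt when the underlying records change.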
The solution will be used for the analysis of defence internal structured and unstructured data to enable MOD to develop new and previously unavailable insights. Here’s a list of potential cases or scenarios:
- data enrichment: the semantic web/linked data (eg people, names, locations and key words)
- social media analysis to evaluate the internal defence population’s opinions on defence and our service: keywords, phrases, people and places
- emotion, opinion and trending analysis: internal defence collaboration portals (eg defence intranet blogs, Defence Connect, Yammer for defence)
- system reading of unstructured documents and file analytics: documents, files, presentations, spreadsheets, PDFs and scanned images
- ingest and analyse audit files and logs
4. What we want
In this CDE competition at phase 1, we’re looking for proof-of-concept technology demonstrations to meet 1 or more of the following challenges:
- Data source integration: this could be from document files, raw files, scanned files and structured data sources. It could also connect to, and replicate data with, other data stores or databases using standard industry drivers.
- Data ingestion and understanding: for example this could be the import, parsing/reading and tagging of information from datasets, files and documents.
- Data store: to provide a data store for both structured and unstructured data to support easy access to data, retrieval, tagging and audit.
- Dictionary: the ability to reuse current dictionaries or create new ones, to support the tagging, understanding and contextual search of structured and unstructured data.
- Rules and events: provide a user configurable rules engine to act upon the data.
- Visualisation/reporting: this should provide a visualisation of the data and its dependencies, and an industry-standard, configurable interface.
- Architecture: provide an open architecture compliant with open standards for interoperability and extensibility, to ensure that we can utilise the technology alongside other tools and solutions.
- Integration: provide an industry-standard, open integration capability with published and documented interfaces.
- Cloud/deployment: be tested and proven for deployment in cloud infrastructure.
- Security: apply industry-standard security models for access, data management and exposed solution services, compliant with the UK Data Protection Act and protection of personal data.
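As one possible reading of the "Rules and events" challenge, a user-configurable rules engine can hold its rules as data rather than code, so that operators can change them without a software release. The sketch below is illustrative only: the rule format, the field names, the operator names and the actions are all assumptions, not a specification.

```python
import operator

# Comparison operators a rule may name; extend as needed.
OPERATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "contains": lambda value, expected: expected in value,
}

# Hypothetical user-editable rules; in practice these might be loaded from JSON.
RULES = [
    {"name": "failed-logins", "field": "failed_logins", "op": "gt",
     "value": 5, "action": "notify-operator"},
    {"name": "tagged-sensitive", "field": "tags", "op": "contains",
     "value": "sensitive", "action": "restrict-access"},
]


def evaluate(record, rules=RULES):
    """Return one event per rule whose condition holds for the record."""
    events = []
    for rule in rules:
        if rule["field"] not in record:
            continue  # a rule about a missing field simply does not fire
        if OPERATORS[rule["op"]](record[rule["field"]], rule["value"]):
            events.append({"rule": rule["name"], "action": rule["action"]})
    return events


events = evaluate({"failed_logins": 9, "tags": ["audit", "sensitive"]})
quiet = evaluate({"failed_logins": 1, "tags": ["audit"]})
```

With these sample rules, the first record raises both events and the second raises none. Because each rule is a plain dictionary, adding or changing a rule is a configuration change rather than a code change.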
Your demonstration must consist of a presentation, documentation and deployable software to prove the technology demonstrator.
Proposals should include:
- a statement on the ease of use of your solution
- a clear description of why your solution is relevant to MOD and any saving to through-life costs
- an outline of any data/equipment requirements of the proposal, and how these will be met; your demonstrator can run on your own cloud service, or MOD can provide one if required; any dependencies on the supply of data/equipment from MOD must be clearly stated
- details of any collaborative approaches to your solution; these are actively encouraged
- the solution approach defined across the functional areas described in the challenges:
  - data ingestion
  - volumes supported
  - high-level solution and technical architecture
  - visualisation experience
  - rules engine experience
- a software development and quality approach:
  - agile methods, tools and best practices
We anticipate that phase-1 projects will be 8 weeks in duration, from 5 October 2015. All projects must be complete by 30 November 2015.
5. What we don’t want
There’s potential for your solution to be taken forward into a longer-term agile-delivery capability for defence. As such, there are certain things we don’t want:
- a product locked into release and functionality cycles that don’t support agile, frequent deliveries of capability, rules changes, visualisation changes and data source/structure changes
- a capability that doesn’t support open standards and architecture; ‘plug-ability’ to existing data and applications is important
- a concept, rather than software; you must be able to demonstrate your working solution
6. Exploitation
ISS intends to take forward successful outputs of CDE-funded projects from phase 1 of this themed competition for phase-2 funding. Subject to exploitable solutions being delivered from phase 1, it’s anticipated that up to £2 million will be made available for phase 2. Funding will be considered on a per-project basis. Only bidders who are funded at phase 1 qualify for entry into phase 2 of this competition. Funding at phase 2 may be contracted through Dstl or MOD ISS and so could be subject to different terms and conditions from the phase-1 CDE contract.
Successfully funded phase-1 bidders will be supported by technical partners from Dstl and MOD during the development/preparation of their technology demonstrators.
Successful bidders will be expected to attend a demonstration event on 30 November and 1 December 2015 to showcase their technology demonstrator to Dstl and MOD stakeholders. Costings for attending this should be included in your proposal.
As a deliverable of the phase-1 project, successful bidders will also be expected to produce a costed plan of the future work to be carried out. This should include solution, technical, deployment, interface and integration architectures; a solution and technical roadmap; and your views on how your solution could benefit defence (ie the scenarios we might use your solution for).
7. Invitation for CDE proposals
Proposals for funding must be submitted by 5pm on Thursday 13 August 2015 using the CDE online portal. Mark all proposals for this themed competition with ‘open-source big data insight’ as a prefix in the title.
The total funding available for phase 1 of this competition is £250,000. We expect to fund a number of projects of up to £50,000 each that can demonstrate their technology within the specified timescales and outline the next phase of work that’s required.
Phase-1 projects are expected to run from 5 October to 30 November 2015.
This competition was supported by presentations given at the CDE Innovation Network event on 24 June 2015. These are available to view via the event webpage.
Read important information on what all proposals must include. Proposals that don’t include the required information are unlikely to be successful.
Proposals will be assessed by subject matter experts from MOD, Dstl and MOD-contracted consultants working under a non-disclosure agreement, using the MOD Performance Assessment Framework. Outputs from successful contracts will be made available to technical partners and subject to review by UK MOD.
8. Key dates
Competition document release | by 4 June 2015 |
CDE Innovation Network event (featuring this competition’s briefing) | 24 June 2015 |
Webinar | 17 July 2015 |
Competition close | 13 August 2015 at 5pm |
Contract placement initiated | from 17 September 2015 |
Feedback sent to unsuccessful bidders | Early October 2015 |
Phase-1 projects | 5 October to 30 November 2015 |
Demonstration days | 30 November and 1 December 2015 |
9. Queries and help
As part of the proposal preparation process, queries and clarifications are welcomed:
- Technical queries about this specific competition should be sent to [email protected].
- General queries (including how to use the portal) should be sent directly to CDE at [email protected].
Capacity to answer these queries is limited in terms of volume and scope. Queries should be limited to a few simple questions. Alternatively, if you provide a short (few paragraphs) description of your proposal, the technical team will provide, without commitment or prejudice, broad yes/no answers. This query facility is not to be used for extensive technical discussions, detailed review of proposals or supporting the iterative development of ideas. While all reasonable efforts will be made to answer queries, CDE and MOD reserve the right to impose management controls when higher-than-average volumes of queries or resource demands restrict fair access to all potential proposers.