Last updated: 2018-07-07.
Purpose of Data Collection
Forwards collects data to
- identify groups that are under-represented with respect to their participation in R community activities;
- monitor how representation of these groups changes over time;
- identify reasons for under-representation; and
- gather ideas and volunteers for actions that can be taken to encourage wider participation.
Means of Data Collection
Forwards collects data via
- its own surveys;
- collaboration with academic researchers;
- collaboration with community organizers.
Forwards will limit the data collected to that necessary for the task at hand. Since most surveys/forms will contain direct identifiers (e.g. name, email) and/or personal information (e.g. age, ethnicity, sexuality), we treat all data collected as confidential and take action to protect against breaches of personally identifiable information.
For surveys conducted by Forwards or in partnership with collaborators, we ensure a secure survey/form platform is used. In particular, we ensure the platform is compliant with the General Data Protection Regulation (GDPR) of the European Union. Access to the platform is restricted to a small number of Forwards members and/or collaborators for the purpose of collecting and downloading the data. Once the full data is downloaded, the copy on the survey/form platform is deleted.
Raw data containing personal and potentially identifiable information is stored in a private GitHub repository. Access to this repository is restricted to the survey team. If a member leaves the survey team, their access to the repository will be revoked and they will be asked to delete any personal copies of the data. Survey data may be shared with selected collaborators to assist us with statistical analysis, and potentially other academic researchers whose aims are consistent with our own, for example those studying participation in open source projects.
All collaborators are expected to observe good security practices, in particular:
- working on password-protected computers that are screen-locked when unattended,
- ensuring any back-ups are password-protected or encrypted,
- not accessing private repositories over insecure Wi-Fi connections, and
- not working in an environment where third parties can easily see their screen.
Other forms may be used to collect data for a specific purpose, e.g. applications for a diversity scholarship scheme or expressions of interest in a conference buddy scheme. The data from these forms will be stored either in a private GitHub repository (or similarly secure online storage) or offline, with access restricted to the Forwards members responsible for the task.
Raw data is kept according to the following schedules:
|Survey/form type|Retention time|
|Diversity scholarship applications|Until all scholarships paid out|
|Expression of interest form|Until task/event completed|
Data that has served its purpose will be deleted or anonymised; anonymised data may be kept to inform future activities.
Since survey data is of interest to the community at large, we do publish the results, but only in forms that minimise risk of personal identification and disclosure of sensitive information. This can mean presenting aggregate summaries (as, for example, in this Mapping useRs blog post), excluding certain variables, or modifying the data to achieve at least 3-anonymity over implicit identifiers (as for the useR2016 data published in the forwards package).
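To illustrate what 3-anonymity means in practice, the following is a minimal sketch, not the procedure Forwards actually used for the useR2016 data. A data set is k-anonymous over a set of implicit identifiers if every combination of their values is shared by at least k records. The column names (`age_group`, `country`), the toy records, and the helper functions are all hypothetical, and suppressing small groups is only one possible way to reach k-anonymity (coarsening categories is another).

```python
from collections import Counter


def group_sizes(records, identifiers):
    """Count how many records share each combination of identifier values."""
    return Counter(tuple(r[col] for col in identifiers) for r in records)


def is_k_anonymous(records, identifiers, k=3):
    """Return True if every identifier combination occurs at least k times."""
    return all(n >= k for n in group_sizes(records, identifiers).values())


def suppress_small_groups(records, identifiers, k=3):
    """Drop records whose identifier combination occurs fewer than k times."""
    sizes = group_sizes(records, identifiers)
    return [
        r for r in records
        if sizes[tuple(r[col] for col in identifiers)] >= k
    ]


# Hypothetical toy responses; real survey data would have more columns.
responses = [
    {"age_group": "18-34", "country": "UK"},
    {"age_group": "18-34", "country": "UK"},
    {"age_group": "18-34", "country": "UK"},
    {"age_group": "35-54", "country": "NZ"},  # unique combination
]
implicit_identifiers = ["age_group", "country"]

before = is_k_anonymous(responses, implicit_identifiers, k=3)
published = suppress_small_groups(responses, implicit_identifiers, k=3)
after = is_k_anonymous(published, implicit_identifiers, k=3)
```

In this toy example the single NZ respondent could be picked out from the two columns alone, so the raw records fail the check; after suppression the remaining three records pass it.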
Data collected in other forms will not be published or shared with third parties.