What is CABS?

This site helps high school students and teachers find original, independent science research topics and questions that can be pursued without a professional lab. These projects can be done in a school lab or even in one's basement! The project ideas and research questions being developed and presented here have been vetted and could lead to true discoveries, not just the rediscovery of already known results. See our Welcome message. These are the types of projects that could be submitted to high school contests such as the Regeneron Science Talent Search, Junior Science and Humanities Symposium, or the Intel International Science and Engineering Fair, and be competitive. If you have an idea to share, or a question about one of the project ideas, contact us at vondracekm@eths202.org.

Pages (on the right side of the screen) have lists of ideas for different types of science research projects, and clicking on one of those ideas will take you to posts with details and all sorts of information about that type of project. Get more information about why there is a need for CABS!

Online Data Sets Research Ideas

Below are project topic and research question ideas. These have been checked and should allow for original research opportunities - that is, you could make actual discoveries and novel findings for each of these projects! You should not need access to a professional laboratory; instead, you can access data sets online in your school lab or even in your own house.

In general, 'Big Data' and Data Science are already here in many fields of STEM. This link goes to a good summary article and video from Northwestern University, one of the leaders in Data Science.

It is important to realize that this type of research requires computer programming knowledge and skills. One of the most popular languages at present for data collection and analysis is Python, and it is the language many graduate students in research areas recommend learning; there are countless tutorials, YouTube videos, and other resources online for Python. One popular free online course for learning Python (as well as some other languages) is offered by Codecademy, although there are many other sources and tutorials online. If you are new to programming, open an account and start a self-paced course in Python!

Just click on a topic of interest, and it will take you to a separate post about that specific topic. You will find background information, relevant links to articles, vocabulary, data accessibility methods, videos, and so on. Hopefully you will find enough information to actually be able to do the project!

Online Data Set Topics and Research Questions:

Datasets Galore - Multiple Topics

For datasets on just about any topic you can imagine, check out the more than 1200 datasets found at https://www.kaggle.com/datasets.


Examples of student research papers using online astronomical datasets:
- Morphological classification of Post-Starburst Galaxies
- Albedo and Heat Recirculation of Hot Jupiter Exoplanets - Effect of Phase Shifts
- The Effect of the Asteroid Belt on the LISA Mission
- Environment and Variability of XBONGs (type of X-ray galaxy)

NASA Open Data Portal

https://data.nasa.gov/ This is NASA's primary site where we can access numerous datasets from different astrophysical experiments! There are portals to find code, as well. An incredibly rich resource for those interested in doing any type of data-based astrophysical project! Other specific experimental sites are listed below.

Astronomical Experiments and Datasets:

European Space Agency Planetary Science Archive
http://www.rssd.esa.int/index.php?project=PSA  Central data repository for ESA missions: currently Giotto, Huygens, Mars Express, Rosetta, SMART-1, and Venus Express, as well as several ground-based cometary observations.

http://exoplanets.org/  One of the premier sites, with up-to-date data on thousands of extrasolar planets and candidates.

Global Telescope Network
http://gtn.sonoma.edu/data_reduction/index.php Data reduction site

Kepler databases - the search for extrasolar planets; Kepler Planet Candidate Data Explorer, which is called Planetquest; the main Kepler site 

NASA Exoplanet Archive
http://exoplanetarchive.ipac.caltech.edu/  Another site with thousands of datasets for extrasolar planets.

NASA Space Science Data Coordinated Archive
Each of the following has numerous links to individual projects/missions:
Solar system exploration: http://nssdc.gsfc.nasa.gov/planetary/

Sloan Digital Sky Survey (SDSS)
Sloan Digital Sky Survey datasets can be found here, along with tutorials on how to access the data.

Spitzer Science Center
http://ssc.spitzer.caltech.edu/  Data from an infrared space telescope.

Variable Star Data

Zooniverse (general)
Over 40 different citizen science projects, where you can help scientists in a variety of fields make sense of enormous data sets!

Two other citizen scientist sites are:

Some instructions for accessing certain Astrophysics Datasets:

1. Create Chandra Images from Raw Data

Basic steps:

1. Download .FITS data files for either X-ray or multi-wavelength images, which can be found at http://chandra.harvard.edu/photo/openFITS/xray_data.html and http://chandra.harvard.edu/photo/openFITS/multiwavelength_data.html respectively
2. Download free image editor program from https://www.gimp.org/
3. Create Chandra images from raw data as described in an example here http://chandra.si.edu/photo/openFITS/crab.html
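
A common step when turning raw photon counts into a viewable image is a nonlinear stretch, so that faint structure is not lost next to a few very bright pixels. The following is a minimal Python sketch of a logarithmic stretch, not the procedure from the Chandra tutorial itself. It assumes the pixel values have already been read from the FITS file into a nested list of non-negative counts (a library such as astropy is commonly used for that step); the function name `log_stretch` and the sample grid are made up for illustration.

```python
import math


def log_stretch(counts, out_max=255):
    """Map non-negative counts to integers in 0..out_max using a log stretch."""
    peak = max(max(row) for row in counts)
    if peak <= 0:
        # An empty exposure: every pixel stays black.
        return [[0 for _ in row] for row in counts]
    # log1p(v) = log(1 + v), so a pixel with zero counts maps to zero.
    scale = out_max / math.log1p(peak)
    return [[round(scale * math.log1p(v)) for v in row] for row in counts]


# Made-up 3x3 grid of photon counts, for illustration only.
sample = [
    [0, 1, 3],
    [1, 40, 9],
    [0, 7, 1000],
]

stretched = log_stretch(sample)
for row in stretched:
    print(row)
```

With this sample, a pixel of 40 counts maps to a mid-gray value rather than the near-black value it would receive under linear scaling against the 1000-count peak.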

2. X-ray spectroscopy of supernova remnants

Basic steps:

1. Download, install, and open the ds9 program
2. Download ds9 image data files of supernova remnants
3. Follow the basic analysis instructions described in detail here http://www.chandra.si.edu/edu/formal/snr/ds9.html
4. Classify a supernova event as type Ia or type II using the spectra, and compare the result with the information in the Photo Album http://chandra.harvard.edu/photo/category/snr.html
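
Part of working with a spectrum is deciding which element produced each emission peak. The sketch below, in Python, shows one simple way to do that bookkeeping: match each measured peak energy to the nearest entry in a small table of line energies. It is an illustration under stated assumptions, not the procedure from the ds9 activity. The energies in the table are rounded, approximate values that should be checked against a published line list, and the function name, tolerance, and sample peaks are hypothetical.

```python
# Approximate energies (keV) of prominent X-ray emission lines.
# Rounded values for illustration; verify against a published line list.
LINE_ENERGIES_KEV = {
    "O": 0.65,
    "Ne": 0.92,
    "Mg": 1.35,
    "Si": 1.86,
    "S": 2.46,
    "Ar": 3.14,
    "Ca": 3.90,
    "Fe": 6.7,
}


def identify_lines(peaks_kev, tolerance=0.08):
    """Match each peak energy to the nearest known line.

    Returns a dict mapping peak energy to an element symbol, or to None
    when no tabulated line lies within the tolerance (in keV).
    """
    matches = {}
    for peak in peaks_kev:
        element, energy = min(
            LINE_ENERGIES_KEV.items(), key=lambda item: abs(item[1] - peak)
        )
        matches[peak] = element if abs(energy - peak) <= tolerance else None
    return matches


# Made-up peak energies (keV), as might be read off a spectrum plot.
peaks = [0.66, 1.85, 2.45, 6.65, 5.0]
for peak, element in identify_lines(peaks).items():
    print(f"{peak:.2f} keV -> {element or 'no match'}")
```

The list of elements found this way can then be compared with the descriptions of type Ia and type II remnants given in the linked activity.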

3. Galaxy classification and evolution with Galaxy Zoo

1. Use the data from https://data.galaxyzoo.org/

4. Interpreting data with photometric transits

5. Tracking Jupiter's moons using image processing software to analyze observatory images of Jupiter and its moons

6. Exoplanet transits using telescope images, image processing software, and data from the Internet to determine the size and orbital period of an exoplanet
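
For the transit activity above, two standard relations connect the light curve to the quantities mentioned. The fractional dip in brightness during transit is approximately the square of the planet-to-star radius ratio, and the orbital period is the time between successive transits. Here is a minimal Python sketch of both calculations. It assumes a central transit with no limb darkening, a known stellar radius, and mid-transit times from consecutive transits; the function names and the sample numbers are made up for illustration.

```python
import math

R_SUN_KM = 695_700.0  # nominal solar radius
R_JUP_KM = 71_492.0   # equatorial radius of Jupiter


def planet_radius_km(depth, star_radius_rsun):
    """Planet radius from transit depth, using depth ~ (R_planet / R_star)^2."""
    return star_radius_rsun * R_SUN_KM * math.sqrt(depth)


def orbital_period(mid_transit_times):
    """Mean spacing between consecutive mid-transit times (same units as input)."""
    gaps = [later - earlier
            for earlier, later in zip(mid_transit_times, mid_transit_times[1:])]
    return sum(gaps) / len(gaps)


# Made-up example: a 1% dip in the brightness of a Sun-sized star,
# with three consecutive transits timed in days.
radius = planet_radius_km(depth=0.01, star_radius_rsun=1.0)
period = orbital_period([2.0, 5.5, 9.0])

print(f"Planet radius: {radius:.0f} km ({radius / R_JUP_KM:.2f} Jupiter radii)")
print(f"Orbital period: {period:.2f} days")
```

In this example, a 1% dip on a Sun-sized star corresponds to a planet roughly the size of Jupiter.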

9. More classroom activities on planet finding

10. More activities based on Galaxy Zoo

11. More activities on the Chandra X-ray Observatory

Long List of Astronomy Programs with Real Data


HIV Databases: Includes Sequence, Vaccine, Immunology databases for HIV, and data for other viruses (Hepatitis C and Hemorrhagic fever). Includes some tools to look at data.

Human Genome Resources and Databases

NIH National Cancer Institute
http://www.cancer.gov/research/resources/data-catalog  data collections from NCI initiatives

Geoscience & Climate Science

Examples of student research papers using online geoscience datasets:
- Antarctic Sea Ice Fractal Analysis (using fractal dimension as measure of change due to warming)
- Scaling law for Strength and Frequency of High-Energy Storms

The NOAA site on paleoclimate data:

There is also one for seismic data:

And of course, there is always NASA, which holds a lot of earth science data:

There is a journal that specifically publishes earth system science data sets:

And Stanford compiled a list of earth science data sources:

USGS Geomagnetism Program

NOAA Geomagnetism Site

NOAA: Climate interests
NCEI is the world’s largest provider of weather and climate data. Land-based, marine, model, radar, weather balloon, satellite, and paleoclimatic are just a few of the types of datasets available. Detailed descriptions of the available products and platforms are below.

  • These links provide quick access to many of NCEI's climate and weather datasets, products, and various web pages and resources.
  • Land-based, or surface, observations include temperature, dew point, relative humidity, precipitation, wind speed and direction, visibility, atmospheric pressure, and types of weather occurrences such as hail, fog, and thunder collected for locations on every continent.
  • Geostationary and polar-orbiting satellites provide raw radiance data collected by ground stations to help monitor and predict weather and environmental events.
  • Radar, an acronym for Radio Detection and Ranging, is an object-detection system that uses radio waves to determine the range, altitude, direction of movement, and speed of objects, producing raw data as well as generating analysis products.
  • Access to near-real-time, high-volume numerical weather prediction and global climate models and data. Looking into the past, present, and future to assist in the analysis of multidisciplinary datasets and promote interoperable data analysis.
  • Weather data from the atmosphere, beginning at three meters above the Earth’s surface. These data are obtained from radiosondes, which are instrument packages tethered to balloons that transmit data back to the receiving station.
  • Meteorological data transmitted from ships at sea, moored and drifting buoys, coastal stations, rigs, and platforms. The data may include weather as well as ocean state information.
  • Past climate and environmental data, derived from natural sources such as tree rings, ice cores, corals, and ocean and lake sediments, extend the archive of weather and climate back hundreds of millions of years.
  • Archive of destructive storm or weather data and information, which includes local, intense, and damaging events such as thunderstorms, hailstorms, and tornadoes. It can also describe more widespread events such as tropical systems, blizzards, nor’easters, and derechos.

Particle Physics

Examples of student research papers using online particle datasets:
- Organizational Properties of Baryonic Decays

The CERN CMS experiment has put an incredibly large dataset online, free for anyone to use. Check this article out for more details.

CERN Open Data

Fermilab - for classes to use as lessons

Social Networks

Examples of student research papers using online datasets of social and professional networks:


  1. Hi,

    A couple of thoughts:

    1. With deep learning technology now widely available as open source, one area of research is to apply neural networks or other machine learning techniques to find patterns or classifications in these data sets.

    2. Here are some more public data sets: https://github.com/caesar0301/awesome-public-datasets

    Mark Morris
    ETHS 1978

    1. Thank you for this wonderful suggestion, and also a very good listing of datasets! This page is obviously a work in progress, and the hope is that a group I am working with at NU will get an NSF grant so we can develop the help and 'how to' resources, so anyone who would like to do research on such datasets will be able to, along with specific, possible research questions to pursue (starting off with astrophysics). If you are aware of such resources for neural networks and machine learning, I would love to add them. Thanks again, and Go Kits!