The entropy of galaxy spectra: how much information is encoded?

Ferreras, Ignacio; Lahav, Ofer; Somerville, Rachel S.; Silk, Joseph
Bibliographical reference

RAS Techniques and Instruments

The inverse problem of extracting the stellar population content of galaxy spectra is analysed here from a basic standpoint based on information theory. By interpreting spectra as probability distribution functions, we find that galaxy spectra have high entropy, thus leading to a rather low effective information content. The highest variation in entropy is unsurprisingly found in regions that have been well studied for decades with the conventional approach. We target a set of six spectral regions that show the highest variation in entropy - the 4000 Å break being the most informative one. As a test case with real data, we measure the entropy of a set of high-quality spectra from the Sloan Digital Sky Survey, and contrast entropy-based results with the traditional method based on line strengths. The data are classified into star-forming (SF), quiescent (Q), and active galactic nucleus (AGN) galaxies, and show - independently of any physical model - that AGN spectra can be interpreted as a transition between SF and Q galaxies, with SF galaxies featuring a more diverse variation in entropy. The high level of entanglement complicates the determination of population parameters in a robust, unbiased way, and affects traditional methods that compare models with observations, as well as machine learning (especially deep learning) algorithms that rely on the statistical properties of the data to assess the variations among spectra. Entropy provides a new avenue to improve population synthesis models so that they give a more faithful representation of real galaxy spectra.
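The idea of reading a spectrum as a probability distribution can be illustrated with a short sketch. The abstract does not give the exact definition used by the authors, so the code below assumes the simplest choice: fluxes in a wavelength window are normalised to unit sum and the Shannon entropy of the resulting distribution is computed. The function name `spectral_entropy`, the logarithm base, and the toy spectra are all hypothetical and serve only as illustration.

```python
import math


def spectral_entropy(flux, base=2.0):
    """Shannon entropy of a spectrum treated as a discrete probability distribution.

    Assumption (not taken from the paper): the probability assigned to each
    pixel is its flux divided by the total flux in the window.
    """
    values = [float(f) for f in flux]
    if not values:
        raise ValueError("empty spectrum")
    if any((not math.isfinite(v)) or v < 0 for v in values):
        raise ValueError("fluxes must be finite and non-negative")
    total = sum(values)
    if total <= 0:
        raise ValueError("total flux must be positive")
    # Pixels with zero flux contribute nothing to the sum (p * log p -> 0).
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log(p, base) for p in probs)


# Toy spectra with 1024 pixels each (illustrative, not real data).
flat = [1.0] * 1024                      # featureless continuum
step = [0.5] * 512 + [1.0] * 512         # crude stand-in for a spectral break
spike = [0.0] * 1023 + [1.0]             # all flux in a single pixel

h_flat = spectral_entropy(flat)    # maximum possible: log2(1024) = 10 bits
h_step = spectral_entropy(step)    # about 9.92 bits, barely below the maximum
h_spike = spectral_entropy(spike)  # zero: the distribution is fully concentrated
```

Under these assumptions, even a factor-of-two break in the continuum lowers the entropy by less than 1 per cent relative to a flat spectrum, which is one way to picture the statement that galaxy spectra have high entropy and thus a low effective information content.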
