Simply Science

Baseballs and Blenders


In our lab we do a good deal of mass spectrometry (MS). That is, we perform mass spectral analysis of a variety of molecules, including proteins, natural products of metabolism (metabolites), and drugs. I'll discuss why we do this later in the article. But what does mass spectral analysis mean? Essentially, we measure the weight of unknown molecules, compare it to the weight of known molecules, or standards, and use that relationship to determine their identity. Sometimes different molecules, proteins for example, weigh essentially the same amount. In that case simply measuring their mass isn't good enough, because we still can't distinguish between them. We can get around this by breaking them into smaller fragments. If the fragmentation occurs in a predictable way, the weights of the resulting fragments from the two proteins will often be sufficiently different to make an identification. If not, we simply break those fragments into still smaller fragments and measure those masses. Let's apply this concept to something familiar to make sense of it.

Is this a bowling ball? Pretend for a moment that you are completely unfamiliar with balls used in American sports. Suppose you are given a baseball, a bowling ball and a basketball, but you don't know which is which because you know nothing about them (of course you could simply look them up on Wikipedia, but that wouldn't serve the purpose of my analogy!). You are provided a list of balls along with general descriptions including circumference, weight, the material from which they are made, how high they bounce when dropped from waist level onto a concrete floor, and how far they can be thrown by the average person. This list is your database, from which you will make identifications. By weighing, dropping and throwing each ball, and comparing your results to the information in the database, you could correctly determine which ball is which. This is a very basic and somewhat silly example, I know, but it makes the point.

But let's say you are given several different baseballs and charged with the task of determining the identity of each. There are lots of different kinds of baseballs, many of which are very similar yet distinct. For example, there are those that are made of rubber throughout. The more "serious" baseballs have a core at their center. The core is made of rubber, cork, or both. Twine or wool is then wrapped around the core. Finally, a cover is applied. In regulation baseballs this cover is leather, although synthetic materials can be used to cover non-regulation baseballs. Let's proceed on the assumption that you have three different baseballs manufactured by three different companies: X, Y and Z. Your job is to determine which company makes each baseball.

Knock the cover off the ball. If you take a really sharp dicer and chop your baseballs into smaller pieces, you can gain a certain amount of information by weighing pieces from each, and comparing your mass results to information in the database. The database includes information on the manufacturer, material used to make the baseball, and the weight of all materials when chopped up by a dicer.

You fragment the first baseball and see that it is made out of only one material. You weigh a set number of "molecules" of this material, and by comparing your mass data to the database, you determine this baseball is made of rubber. Since X is the only company that makes its baseballs entirely out of rubber, you know that this is a rubber baseball manufactured by company X. Simple.

Now for baseball number 2. After dicing, you find fragments of several different materials. You weigh a set number of "molecules" of each material and compare your mass data to the database. But this time it's not so easy. Based on the information in the database, you can't determine whether the baseball is manufactured by Y or Z. Undaunted, you try baseball number 3 and get exactly the same results. It turns out that except for the core, companies Y and Z make their baseballs from exactly the same materials. So at this point you are stuck. But not really, because all you need is more information.

So you take the fragments obtained from dicing and place them in a really powerful blender. Now you are applying additional energy to the fragments, and as a result, you break them into smaller units. This allows you to fragment the cores of the baseballs, which you weren't able to do with the dicer. You weigh the fragments from each core and refer once again to your database. Now you can make a positive identification, because company Y makes its cores out of cork and company Z makes its cores out of a combination of cork and rubber. The positive identification is possible because cork and rubber have distinct masses. Finally, you have the information you were after. You just needed to examine the baseball materials in sufficient detail.

The real world. Let's relate the example above to what actually takes place in a mass spectrometer. Unlike our baseball example, when we analyze molecules by mass spectrometry, we apply a charge to the molecules, converting them to ions. So what we actually measure is a mass-to-charge ratio, expressed as m/z. The charge allows us to detect the ion, and also determines how it behaves in electrical or magnetic fields. Mass spectrometers are made up of three primary components: a source, a mass analyzer and a detector. There are several types of each used in mass spectrometry instruments today. The source applies the charge and presents the ion to the mass analyzer. The mass analyzer separates the ions by various means so that they reach the detector at different times. One type of mass analyzer is a time-of-flight tube. As they enter the flight tube, ions are given an electrical "push" to send them down the tube. Lighter ions (more precisely, ions with a lower m/z) fly faster and reach the end of the tube before the heavier, slower ions. Finally, at the end of the tube, the detector records an electrical signal as each ion arrives. Because the instrument records the exact time when the "push" occurred, the time required for the ions to "fly" down the length of the tube can be calculated, and the mass-to-charge ratio determined. The MS instrument is pre-tuned (calibrated) using molecules of known mass, and so the precise mass of unknown molecules can be calculated.
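
For the curious, the time-of-flight idea can be sketched in a few lines of Python. This is a minimal sketch of an idealized flight tube, not a description of how any particular instrument works: it assumes every ion gets the same "push" from an accelerating voltage V and then drifts a distance L, so that the flight time is t = L * sqrt((m/z) / (2V)) when m/z is expressed in kilograms per coulomb. The tube length and voltage below are hypothetical values picked only for illustration, and the function names are my own.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one proton, in coulombs
DALTON_KG = 1.66053906660e-27          # one dalton, in kilograms

# Hypothetical instrument settings, chosen only for illustration.
TUBE_LENGTH_M = 1.0
ACCEL_VOLTAGE_V = 20000.0


def flight_time(mz, length_m=TUBE_LENGTH_M, voltage_v=ACCEL_VOLTAGE_V):
    """Seconds for an ion with the given m/z (daltons per charge) to cross the tube."""
    kg_per_coulomb = mz * DALTON_KG / ELEMENTARY_CHARGE_C
    return length_m * math.sqrt(kg_per_coulomb / (2.0 * voltage_v))


def mz_from_flight_time(seconds, length_m=TUBE_LENGTH_M, voltage_v=ACCEL_VOLTAGE_V):
    """Invert flight_time: recover m/z (daltons per charge) from a measured time."""
    kg_per_coulomb = 2.0 * voltage_v * (seconds / length_m) ** 2
    return kg_per_coulomb * ELEMENTARY_CHARGE_C / DALTON_KG


if __name__ == "__main__":
    # Lighter ions should show shorter flight times than heavier ones.
    for mz in (500.0, 1000.0, 2000.0):
        t = flight_time(mz)
        print(f"m/z {mz:7.1f} -> {t * 1e6:.2f} microseconds")
```

In a real instrument the relationship between flight time and m/z is fitted from the known-mass calibration molecules rather than computed from first principles, which is why the pre-tuning step matters.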

No, we don't really use blenders. I mentioned that we look at a variety of different molecules using MS. Let's focus on proteins. Mass spectral analysis of proteins is typically referred to as "proteomics" (although any means of studying the structure, function, etc. of proteins should be considered proteomics). The average size of a protein is around 50,000 daltons. That's roughly the mass of 50,000 hydrogen atoms. We can measure the m/z of whole proteins, but typically this doesn't give us enough information. There may be hundreds of proteins that weigh 50,000 daltons. In other words, we usually aren't comparing baseballs to bowling balls, but rather very similar baseballs. Typically, to initially fragment proteins we use proteases. These are enzymes that cut, or digest, proteins into smaller fragments. Some proteases, such as trypsin, cut proteins at specific sites (trypsin is produced by the pancreas and works in the small intestine, where it plays a role in food digestion). Site specificity is important because our database search engines take advantage of this characteristic. When we compare our m/z data to the database, the search engine performs a theoretical, or in silico, digest of all proteins. Each protein has a signature of fragments, or peptides, that are generated by enzymatic digestion. Peptides are made up of a chain of amino acids, the building blocks of proteins. The peptide signature obtained from the theoretical digest, based on m/z, is compared to the results obtained from the mass spectrometer. When possible, this information is then used to correctly identify the protein.
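
To make the theoretical digest a little more concrete, here is a rough sketch in Python. It is not the algorithm of any real search engine. It assumes the commonly used trypsin rule (cut after lysine, K, or arginine, R, unless the next amino acid is proline, P), uses standard monoisotopic masses for the amino acids, and ignores complications such as missed cuts and chemical modifications. The protein sequence and the function names are made up for illustration.

```python
# Monoisotopic masses (in daltons) of amino acids as they occur inside a chain.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # added once per peptide for the free chain ends
PROTON = 1.007276   # mass of each charge-carrying proton


def tryptic_digest(sequence):
    """Cut after K or R, except when the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # whatever is left at the end
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide, in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def peptide_mz(peptide, charge=1):
    """m/z of the peptide carrying `charge` extra protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


if __name__ == "__main__":
    protein = "AGKRPLLRSTY"  # hypothetical sequence, for illustration only
    for pep in tryptic_digest(protein):
        print(f"{pep:8s} m/z (1+) = {peptide_mz(pep):.4f}")
```

The list of m/z values printed for a protein is its "signature". A search engine does the same kind of bookkeeping for every protein in the database and looks for the signature that best matches what the instrument measured.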

Rubber or cork? But just as many intact proteins have the same mass, so do many peptide fragments. In these cases, we need more information to distinguish cork from rubber. So we apply more energy to the ions to fragment them. This fragmentation also occurs in a predictable fashion. Thus we are able to go back to our database and mine it for additional information. And if that still isn't enough? You guessed it. We fragment those fragments and try again.
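
Here is a small Python sketch of why fragmenting helps. It assumes a common simplification that I haven't described above: the peptide breaks between neighboring amino acids, giving one series of fragments that keep the front of the chain (often called b ions) and one that keep the back (y ions). The two four-letter peptides are invented for illustration. They contain the same amino acids in a different order, so their whole masses are identical.

```python
# Monoisotopic residue masses (daltons) for the few amino acids used here.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "K": 128.09496}
WATER = 18.01056
PROTON = 1.007276


def peptide_mass(peptide):
    """Neutral monoisotopic mass of the whole peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fragment_ions(peptide):
    """Singly charged b and y fragment m/z values (assumed simple model)."""
    b_ions, y_ions = [], []
    for cut in range(1, len(peptide)):
        front, back = peptide[:cut], peptide[cut:]
        b_ions.append(sum(RESIDUE_MASS[aa] for aa in front) + PROTON)
        y_ions.append(sum(RESIDUE_MASS[aa] for aa in back) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    for pep in ("GASK", "AGSK"):  # hypothetical peptides, same composition
        b_ions, y_ions = fragment_ions(pep)
        print(pep, f"whole mass = {peptide_mass(pep):.4f}")
        print("  b:", [round(m, 4) for m in b_ions])
        print("  y:", [round(m, 4) for m in y_ions])
```

The whole peptides weigh the same, but the first front-end fragment is a lone G in one case and a lone A in the other, so their fragment masses differ. That difference is the cork versus the rubber.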

A quick comment about databases: Not all proteins in your body have been identified or characterized. So how can we theoretically identify all proteins using databases? Proteins are encoded by DNA. Genes carry the information that is translated (via RNA) into a whole, functional protein. Although there is no database that contains all the structural information for all proteins, there are databases that contain the entire genetic sequence found in our cells. During a database search, gene sequences, whether from known or predicted genes, are converted into protein sequences. So when we perform proteomic analysis using mass spectrometry and database searching, we are actually comparing our data against DNA sequence information, not directly measured protein sequences.
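
The conversion from gene sequence to protein sequence is simple enough to sketch in Python. This is a bare-bones illustration under several assumptions: it uses the standard genetic code, reads the DNA three letters (one codon) at a time starting from the first letter, and stops at the first stop codon. Real genes involve extra steps that this ignores, and the short DNA string is invented for the example.

```python
import itertools

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate coding DNA into a protein sequence, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCCAAATAA"  # hypothetical coding sequence, for illustration only
    print(translate(gene))
```

Once every known or predicted gene has been turned into a protein sequence this way, the search engine can digest those sequences in silico and compare the predicted masses with the measured ones.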

Why spend your time identifying molecules? There are a variety of reasons. Sometimes we know exactly what we are looking for. We just want to know how much of it is in a biological sample. A good example is drug monitoring. Mass spectrometry is used to measure the amount of a drug in your blood. This technique is used to detect illegal drugs in people of interest, such as baseball players. It has clinical uses as well. When a patient is given a drug during clinical therapy, the effectiveness of that drug depends in part on how quickly it is cleared from the patient's blood. Mass spectrometry is used to determine how much of the drug is available in the bloodstream. The physician can then either prescribe a higher dose or switch to a different drug.

Another reason to do what we do is the hope of biomarker discovery. The best chance to effectively treat patients for potentially fatal or debilitating diseases is to diagnose them accurately and early in the progression of the disease. Even better, potentially, is to determine who is likely to get a specific disease before they actually have it. Biomarkers are clues that are used to make these determinations. Biomarkers can be genes, proteins or metabolites. They can also be physiological measurements such as heart rate or lung volume. Although not an easy task, biomarker discovery is a hot area of biomedical research. The goal is to find biomolecules that can predict or detect disease, as well as monitor the effectiveness of clinical therapy. This is an area that holds great promise for managing human diseases.
