
California High Speed Rail Ridership Analysis- Excessively Constrained Numbers

  • July 26th, 2010 8:43 am PT
Samer Madanat, October 2009, Panel discussion about the challenges of High Speed Rail
Photo: Courtesy of the Institute of Transportation Studies, University of California, Berkeley (ITS Berkeley)

In the prior 3 articles, we covered:

• The top 10 flaws of the High Speed Rail Authority’s ridership and revenue model: Letterman Top 10 of Ridership Forecasting No No's
• Why this model can’t predict either healthy profits or dismal losses: Model-Deemed-Unreliable
• Specific decisions made by the model makers that may have inappropriately influenced the choice of the Pacheco Pass over the Altamont Pass for the route of the proposed high speed train from San Francisco to San Jose: Pacheco vs. Altamont

We now will discuss one more model element, constrained numbers, and how they might have played a role in the unreliability of the model.

What is a constrained number? It’s a number that is fixed by the analyst rather than derived from data or a formula. Such changes are usually made only in rare circumstances, when the findings are very odd or inconsistent with the data. They should be based on the judgment of senior analysts who understand how to make other adjustments to compensate for the fixed number, and who can fully understand and explain the consequences of the change. It’s important to know where and why changes were made.

A comparison of the “Final Ridership Model Report” published in August 2006 with the final numbers disclosed in January 2010 shows a huge discrepancy. The published report contains 54 constrained numbers, while the validation report released in January 2010 contains 470. Those numbers were also not peer-reviewed.

Commentary from Norman Marshall’s Sworn Declaration

But it’s the way those adjustments were made that is particularly troubling. In his declaration (pg. 12-13), Norman Marshall, cofounder of Smart Mobility, says, “It is common to make constant adjustments to better match ridership data. If those adjustments were significant it would have also been necessary to adjust the high speed rail constants as well but these adjustments need to be consistent across modes. They didn’t.”

“In the original model, the estimated frequency (headway) coefficients were all highly statistically significant (Cambridge w/Bradley Research and Consulting, Aug 2006-Table 3-15, p3-37), so the lack of statistical fit was not a basis for constraining the coefficients. Nevertheless, in the final California high-speed rail model, the frequency (headway) coefficients were constrained to 100% of the in-vehicle time coefficient. This implies that the effect of an additional hour between train departures on ridership is just as great as an additional hour on the train.”

“This is contrary to common sense, and if true, would cancel out much of the rationale of high-speed train service.” Marshall goes on to say, “If the survey data resulted in this 100% ratio, it would be necessary to give it some credence, but as discussed before, the survey data indicate the ratio to be about 20% or one fifth as great.”
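To make the difference between the two ratios concrete, here is a minimal sketch. The in-vehicle time coefficient is a hypothetical value chosen only for illustration (it is not taken from the Cambridge Systematics model), and the function names are ours. The sketch converts one extra hour between departures into the number of on-train minutes that would cost a traveler the same utility under each ratio.

```python
# Illustrative sketch only. The coefficient below is an assumed value,
# not a number from the California high-speed rail model.
IN_VEHICLE_TIME_COEF = -0.010  # utility per minute spent on the train (hypothetical)


def headway_disutility(extra_headway_min, ratio):
    """Utility change from extra minutes between departures when the headway
    coefficient is fixed at `ratio` times the in-vehicle time coefficient."""
    return ratio * IN_VEHICLE_TIME_COEF * extra_headway_min


def equivalent_in_vehicle_minutes(extra_headway_min, ratio):
    """On-train minutes that would produce the same utility loss."""
    return headway_disutility(extra_headway_min, ratio) / IN_VEHICLE_TIME_COEF


# One extra hour between trains under the two ratios discussed by Marshall.
constrained = equivalent_in_vehicle_minutes(60, ratio=1.0)   # 100% ratio in the final model
survey_based = equivalent_in_vehicle_minutes(60, ratio=0.2)  # about 20% per the survey data

print(f"100% ratio: {constrained:.0f} equivalent on-train minutes")
print(f" 20% ratio: {survey_based:.0f} equivalent on-train minutes")
```

Whatever value the in-vehicle coefficient takes, the equivalence depends only on the ratio, which is why a ratio set by constraint rather than by the survey data matters so much for the forecast.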

He also adds, “It is clear that the model constants were not properly disclosed to the public during the environmental review process. If the final model coefficients had been made available to the public during the environmental review process, any person reasonably well-versed in transportation modeling would have had serious questions about the validity of the model. The fact that the final model was not made available to the public made detection of the model’s flaws virtually impossible.”

The commentaries below are opinions from UC Berkeley’s Professor Samer Madanat and from Elizabeth Alexis, an econometrics expert and co-founder of Californians Advocating Responsible Rail Design (CARRD).

Professor Madanat’s Commentary:

Kathy Hamilton: Elizabeth Alexis describes the process by which Cambridge created this study as much like a “Whack-a-Mole” game, in which you bang down the mole with a rubber mallet when it appears, and it then moves to another hole and pops up again. The goal is to keep the mole from popping up.

Would you describe Cambridge Systematics’ need for excessive constrained numbers as the answer when issues kept popping up in the validation phase?

Samer Madanat: (He laughs and comments that he doesn’t know that game, and then says:)

Yes, that’s correct, they did. If you read the press release and also our executive summary, first of all their sample was choice-based, that is, the sample of questionnaires they collected and used to develop the main mode model. You need to realize this is a very complex model system; there are many models within it. One model predicts how many trips, another predicts destination, a third one predicts by what mode and the fourth one predicts the access and egress.

For the main mode choice model -- that is, the choice between HSR, air, car and conventional rail -- the data they used is a choice-based sample, so it is not a representative sample of the California population. Now that’s fine, people do this because they are interested in the behavior of some groups of the population which are difficult to capture in a random sample. If you did a random sample of travelers you’d have mostly people who drive. You’d get very few air travelers and you’d get almost no rail users.

So what they did was they oversampled the air travelers and the rail users, and that’s fine as long as you correct for it. You correct for it in the model estimation, which is the first step.

Now in one particular type of model, which is not the model they [Cambridge Systematics] have, the correction is very simple and it consists of looking at a group of coefficients in the model called the constants. Having a choice-based sample does not affect the other coefficients; it only makes the constants wrong. As long as you validate the model against observed shares, you can correct those constants. That would have been ok if their model was of a certain type, but unfortunately their model is not of this type and that calibration method is therefore wrong. So that is the first thing.

The validation method is wrong, because it only works for one type of model, which is called the MNL (Multinomial Logit) model. The type of model they used was an NL (Nested Logit) model, [therefore] that method is wrong and has been proven wrong in a very important paper that was co-authored by Dan McFadden, Nobel Laureate in Economics and the father of demand modeling. He is the inventor of these methods. So that’s a problem.
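[Author’s illustration: the simple correction Professor Madanat describes for the MNL case can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure Cambridge Systematics used: all shares are hypothetical, the model contains only mode constants, and car is the reference mode. In an MNL model with a full set of mode constants, the textbook fix is to subtract ln(sample share / population share) from each constant. Per the interview, this shortcut does not carry over to a nested logit model.]

```python
import math

# Hypothetical shares for illustration only (not from the HSR study).
population_share = {"car": 0.90, "air": 0.08, "rail": 0.02}  # actual market shares W_j
sample_share = {"car": 0.40, "air": 0.40, "rail": 0.20}      # oversampled survey shares H_j


def mnl_shares(constants):
    """Multinomial logit shares for a model that contains only mode constants."""
    exp_u = {mode: math.exp(c) for mode, c in constants.items()}
    total = sum(exp_u.values())
    return {mode: value / total for mode, value in exp_u.items()}


# A constants-only MNL fitted to the survey reproduces the survey shares,
# so its constants (car as reference mode) equal ln(H_j / H_car).
estimated = {m: math.log(h / sample_share["car"]) for m, h in sample_share.items()}

# Correction for choice-based sampling: subtract ln(H_j / W_j) from each constant.
corrected = {
    m: estimated[m] - math.log(sample_share[m] / population_share[m])
    for m in estimated
}

uncorrected_pred = mnl_shares(estimated)  # mirrors the oversampled survey
corrected_pred = mnl_shares(corrected)    # recovers the population shares

for mode in population_share:
    print(f"{mode}: uncorrected {uncorrected_pred[mode]:.2f}, corrected {corrected_pred[mode]:.2f}")
```

[Without the correction, the model simply echoes the oversampled survey; with it, the predicted shares match the actual market. Only the constants are touched, which is the point Madanat makes about the MNL case.]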

Hamilton:  And that was proven wrong fairly recently?

Madanat:   Yes, in fairness to Cambridge Systematics, that was proven wrong in a 2008 paper after they developed their model.

Now, where they went doubly wrong was not only did they adjust the constants; they adjusted many other coefficients because things were just not matching with the observed shares of different modes. So they started playing with a number of other coefficients. In a way, when the coefficients had a certain magnitude that didn’t work for them, they changed it in a manner to match their professional judgment. Therefore they gave more weight to their professional judgment than to the data.

[Author’s comment: in other words, Whack-a-Mole.]

Hamilton: The High Speed Rail Authority had one report that was peer reviewed in some way and then later Cambridge Systematics came out with another report. I think it was that the MTC didn’t take the numbers or didn’t turn them over to the HSR Authority. It wasn’t peer reviewed, and I am wondering if you saw any differences in their last report that was published in January.

Madanat: So it wasn’t a report that was published this January, it was the final model. In their initial set of the 7 reports, what they had were the results of the original model estimation and then in the subsequent report, the calibration report, they showed how some of these coefficients changed. And they never published the final set of parameters for all the models. They didn’t do that.

And then when we asked, “So what models did you use for forecasting?” they said, “Well, here they are,” and these were not part of any of the reports.

This is that memo that came out in January.  Model--Constants-Transmittal.pdf

Hamilton: Were those parameters substantially different from the others?

Madanat: Yes, these were the results of the calibration of that second step. [Validation] Had I done it I would have preferred to have a final, updated report that said, after we finished all the calibrations, these are all the results for all the models, the parameters for all the models, which eventually they did produce at the request of Elizabeth Alexis.

Elizabeth Alexis’s Commentary

Hamilton: You said Berkeley stopped after they found the flawed analysis?  Did they have all the numbers, what was their scope as far as you understood?

Alexis: Berkeley was asked, for the sake of time and money, to answer the first-level question, which was, “Is this model reliable? Should we be using it for policy?” And the answer is no.

Using a flawed model to predict how flawed it is, is also challenging.  So the only real way to find out the real numbers is to do a new study.  The work we did suggests in a number of cases that these numbers were very high specifically for people currently driving instead of flying.

After the July 8th CHSRA board meeting, Elizabeth Alexis added this comment: “If you asked a different question such as, ‘Could you tell what the Pacheco or Altamont relative ridership potential was by tweaking the model?’, I'd say yes. You might be able to determine the correct ratio. This might help the environmental clearance issues.”

A number of mistakes stemming from poor judgment calls by Cambridge led to almost as many people abandoning their cars as abandoning the airlines. That’s not what their own study showed. Air was a much closer substitute for high-speed rail than autos. Around the world, high-speed rail is very competitive with flying.

Cambridge assumed you got very limited benefits by piling extra people in the car especially for these long trips. That was one mistake they made. The second mistake they made was that they used an inflated cost of driving. That might have been relevant for local transportation at one time, but that isn’t even the model MTC uses now for their own forecast.

There were three or four or more different things that happened which led to exaggerating the number of people who would leave their cars, way beyond what happens anyplace else in the world with much higher gas prices.

In the recent High Speed Rail paper, HSR is showing for the San Francisco to Los Angeles market you would capture 44% of the air travel market and you would capture 40% of those currently driving.  And that’s just not correct.

Hamilton: I believe you had access to some of the raw questionnaire data from Cambridge that Berkeley didn’t see. What is your professional opinion about the method that was employed in this survey and how this survey impacted the numbers?

Alexis: Yes, I did see that data. There were a number of issues. First, they had limited it geographically to those who were near High Speed Rail stations, which meant that the model is not going to be very reliable in predicting whether people who live far away from the station will want to take the train instead of driving. In the particular model that Cambridge did, they ended up with a large number of people driving from very far away. They had a large number of people who were willing to drive 100 miles to the train station. That’s not what we see across the world. Given they didn’t survey those people, that’s a conclusion that should have given them a lot of pause for thought.

Hamilton: And, what do you think about the substitution of a constrained number vs. using the survey data, and what do you think that did to the results both financially and with ridership projections?

Alexis:  Anytime you are not going to listen to your data, it better be for a very good reason.  If you were going to change all the numbers anyway, what was the point of doing a survey?

Hamilton:  I understand there are sometimes scientific reasons why you put in a constrained number instead of using data, but  do you think from looking at the model, the numbers were excessively constrained?

Alexis: I think so. If you have to start fixing the numbers all over the place, it means that your model itself is not explaining travel behavior today. How on earth are you supposed to expect it to predict what will happen if you change some of those variables, such as cost, travel time and frequency, in the future?

Hamilton:  There was an oversampling of air and rail users in the survey.  What ramifications does that have?

Alexis: That always has ramifications. Sometimes people who are trying to do these kinds of studies will try to overlook that, because it’s much cheaper to find somebody who’s taking a trip in California. You go to the airport, you go to the gate with people who are about to get on a flight to LA. That’s easier. 100% of the people will be your target market. That’s much easier than calling up 100 people and finding 10 people who take those kinds of trips. Sometimes you can do this, you can make some adjustment, except for the fact that this is not a random sample.

In this particular case, it would be very hard to fix this. Also, once you used how someone is currently traveling to explain their hypothetical choices, nothing else mattered, which is a very bad sign. What’s incredible is that they didn’t make any adjustments, even though people who take planes now are much more likely to choose High Speed Rail than someone who is currently driving. They are not the same.

And the other issue is that people who choose to fly generally value their time differently than someone choosing to drive to LA. So you’re going to have very skewed numbers.

Hamilton: Do you have an opinion about the published final model vs. the final model that you found in January 2010?

Alexis:  It’s inconceivable that you would have a final report and not include the final model that was finalized a year before the report. It’s completely unprofessional to have incomplete and erroneous information in your final report.

If you are a policy maker, your choices are to have no model and fly blind and just decide what you want to do, or to do a new study. On a bigger-picture level, we need to understand how we ended up with such a messed-up survey. This applies not just to this project, but to every project we’ve done in the past and are contemplating doing in the future. The same issues apply.

 

The next article will be a discussion of the global industry practices and the failures that accompany them.

 

Learn more about other model components in recent articles and the University of California, Berkeley, Institute of Transportation Studies (ITS Berkeley) research team’s review of the California High-Speed Rail Authority’s forecasts of demand and ridership for the proposed San Francisco-to-Los Angeles high-speed rail system.

 

Frequency, trainsplitting, headway

ITS Berkeley report
