Monday, June 3, 2019

Statistical Analysis of Train Arrival Times

Introduction

In carrying out this project, the researcher presents the findings of the project work using class material and statistical data conveyed through the concrete-magazine abstract and Irish Rail's Annual Report. This establishes, in short, the output response in terms of the trains' arrival times. In addition, regression analysis graphs are produced for the null and alternative hypotheses under test, and Minitab (one-way analysis of variance) is used to determine the p-value by way of the design of experiments (DoE).

The project focuses on the requirements set under the Public Service Contract between the National Transport Authority and Iarnród Éireann Concerning Compensation for Public Service Obligations pursuant to the Dublin Transport Authority Act 2008 (as amended by the Public Transport Regulation Act 2009) and EC Regulation 1370/2007, Schedule B, Performance Obligations of Iarnród Éireann (Irish Rail 2017).

This report documents the key information for the project, namely:

- Describing the process being analysed, generically and technically.
- Designing an experiment that determines the effect of the factor on the output response, running the experiment and gathering the data appropriately, taking into account sample size, randomization, independence and previous results available.
- Providing statistical analysis of the experiment and describing the statistical evidence collected, in terms of null and alternative hypotheses.
- Showing the results of the experiment. In particular, does the factor affect the output response?
- Do the assumptions, statistical and technical, appear reasonable for the data collected?
- Identifying one area of weakness in the study and/or its results and suggesting how a new study could address it to improve the performance of the process. (Luu, 2017)

On completion of this project the researcher hopes to reach an agreement between the train time performance obligations and the real-time data collected by the researcher. This gives rise to the question posed through the null and alternative hypotheses. The findings are presented as graphs that focus on the residual analysis and on the p-value, based on Irish Rail's 95% confidence requirement.

Thereafter, the level of compliance reported by Irish Rail is compared with the output response of the real-time analysis carried out over a two-week period. Results were gathered using the Irish Rail real-time software application.

Finally, the researcher has an added interest in the findings: he has spent the last five years using the service from Dublin to Galway and from Galway to Dublin, and is fully aware of the arrival delays he has endured on reaching his final destination, their add-on effects and the interlinked knock-on effects.

Irish Rail company profile

Iarnród Éireann provides passenger and freight rail services, both intercity and regional, operating between Dublin and Belfast, Sligo, Ballina, Westport, Galway, Limerick, Ennis, Tralee, Cork, Waterford and Rosslare Europort. Iarnród Éireann jointly operates the Dublin to Belfast Enterprise service with Northern Ireland Railways (Irish Rail 2017). In addition, the DART service operates between Greystones and Howth/Malahide.
It also runs a commuter service in the Dublin area between Gorey, Drogheda, (Irish Rail 2017)

Performance obligation: punctuality and reliability track records

Punctuality targets are set for all routes by the National Transport Authority (NTA), which regulates Irish Rail's performance rates. Punctuality is defined as arriving on time or within 10 minutes of the scheduled arrival time. Delays outside Iarnród Éireann's control include trucks hitting bridges and extreme weather conditions such as snow or fog (Irish Rail 2017). The NTA performance reports under the Public Service Obligation contract are measured against Iarnród Éireann's punctuality records. These exclude the delays described above, and the figures are independently verified by the NTA. Reliability, simply put, is whether the train operates or not.

Train performance in terms of punctuality and reliability

Galway to Dublin track performance results 2016

Period  Dates             Punctuality  Reliability
01      Jan 01 to Jan 31  95.1%        100%
02      Feb 01 to Feb 28  96.4%        100%
03      Feb 29 to Mar 27  96.8%        100%
04      Mar 28 to Apr 24  94.9%        100%
05      Apr 25 to May 22  95.4%        100%
06      May 23 to Jun 19  95.5%        99.38%
07      Jun 20 to Jul 17  94.4%        100%
08      Jul 18 to Aug 14  94.3%        100%
09      Aug 15 to Sep 11  96.7%        100%
10      Sep 12 to Oct 09  97.9%        100%
11      Oct 10 to Nov 06  93.4%        100%
12      Nov 07 to Dec 04  92.6%        99.70%
13      Dec 05 to Dec 31  0.00%        0.00%

(Irish Rail 2017)

Galway to Dublin track performance results 2015

Period  Dates             Punctuality  Reliability
01      Jan 01 to Jan 25  92.2%        99.83%
02      Jan 26 to Feb 22  98%          100%
03      Feb 23 to Mar 22  95.2%        100%
04      Mar 23 to Apr 19  95.8%        100%
05      Apr 20 to May 17  92.6%        100%
06      May 18 to Jun 14  96.9%        100%
07      Jun 15 to Jul 12  95.5%        100%
08      Jul 13 to Aug 09  93.3%        100%
09      Aug 10 to Sep 06  94.9%        100%
10      Sep 07 to Oct 04  96.3%        100%
11      Oct 05 to Nov 01  88.8%        99.07%
12      Nov 02 to Nov 29  80.2%        99.69%
13      Nov 30 to Dec 31  91.5%        100%

(Irish Rail 2017)

Galway to Dublin track performance results 2014

Period  Dates             Punctuality  Reliability
01      Jan 01 to Jan 26  95.2%        99.83%
02      Jan 27 to Feb 23  91.2%        100%
03      Feb 24 to Mar 23  94.3%        100%
04      Mar 24 to Apr 20  97.7%        100%
05      Apr 21 to May 18  96.1%        100%
06      May 19 to Jun 15  96.5%        100%
07      Jun 16 to Jul 13  94.3%        100%
08      Jul 14 to Aug 10  94.8%        100%
09**    Aug 11 to Sep 07  98.6%        100%
10      Sep 08 to Oct 05  95.8%        100%
11      Oct 06 to Nov 02  90.4%        100%
12      Nov 03 to Nov 30  89.8%        100%
13      Dec 01 to Dec 28  96.6%        99.71%

(Irish Rail 2017)

Design of Experiment

In this project the objective of the design of experiments (DoE) was to discover whether the punctuality (train delay) observed in the real-time analysis meets the requirements set, and whether repeating the tasks would give the same results or whether the process could be improved to achieve better results. The topic chosen had to be of significant value in order to obtain the right information, which in turn helps to design the experiment in the right manner; otherwise the work could be confused with something else, such as an observational study (Reilly 2017, pg 109). In contrast to an observational study, a designed experiment sets out to identify causes that may enable us to change the behaviour pattern and help improve the process.

In achieving the final results, this experiment considered the effect of a factor (Time of Day) on an output response (the train delay, recorded under the label "Different time of the day"). Additionally, the experiment placed emphasis on a number of different train times (factor levels), which were randomly selected beforehand and consisted of peak and non-peak times during the week.

Hypothesis Testing

To begin, one can only assume that what the company reports is accurate. It should not be taken as accurate until proven, but as a starting point one assumes that the null hypothesis is true. In statistics, the theory being tested is called the null hypothesis (H0). Hypothesis is another word for theory, and it is null because at the outset it is neither proven nor disproven (Reilly 2017, pg.
68).

In this task the objective is to prove or disprove Irish Rail's claim that punctuality is at 95%, and to show how close to or far from 95% it is. Therefore you have to ask the question: what is the probability of the data, assuming that the null hypothesis is true? This probability is called the p-value (Reilly 2017, pg. 68).

Then, using the standard α = 0.05 cut-off, the null hypothesis is rejected when p < .05 and not rejected when p > .05. Either decision can be wrong: rejecting a true null hypothesis is a type one error, and failing to reject a false null hypothesis is a type two error.

The Null Hypothesis being asked

The null hypothesis (H0): the time of day does not affect the response (the train delay). According to the null hypothesis, all the factor levels have the same mean response and only random variation is present.

The alternative hypothesis (H1): the time of day does affect the response (the train delay). This means that the factor has an effect on the response and that some of the variation in the response is explained by the factor.

Single-Factor Experiments and ANOVA by software

For the purpose of this project, a single-factor experiment was carried out, which considered the effect of one factor on a response, as stated earlier. Furthermore, other factors that could affect the arrival time, such as accidents on bridges and extreme weather conditions, were kept constant during this experiment by applying the principles of experimental design.

Principles of Experimental Design in this case

Replication

To get a true measurement of the response, three tests a week were carried out for each factor level over two weeks. This allows you to see how much random variation occurs in the response even when the factor level remains the same, otherwise known as the error variance (Reilly 2017, pg 109).
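As a rough illustration of the error variance described above, the sketch below splits the total variation in the delays into a between-level part and a within-level (error) part. It uses the delay values from the data table later in this post, grouped under the eight level labels that appear in the Minitab output. The assignment of values to the label variants is inferred from the descriptive statistics (N, mean, minimum), so it is an assumption, and the "/a" and "/b" suffixes and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean

# Delay observations grouped by the eight level labels in the Minitab output.
# ASSUMPTION: which values sit under which label variant is inferred from the
# descriptive statistics; the "/a" and "/b" suffixes are hypothetical names.
groups = {
    "0630-0841/a": [5, 4, 5, 4],
    "0630-0841/b": [0, 5],
    "0930-1200/a": [12, 2],
    "0930-1200/b": [6, 8, 4, 3],
    "1305-1543": [7, 6, 3, 0, 2, 8],
    "1505-1742/a": [8, 1, 9, 3],
    "1505-1742/b": [14, 7],
    "1920-2147": [4, 2, 8, 16, 4, 1],
}


def one_way_anova(groups):
    """Return the sums of squares, mean squares and F statistic."""
    values = [x for obs in groups.values() for x in obs]
    grand_mean = mean(values)

    # Variation explained by the factor: level means around the grand mean.
    ss_between = sum(
        len(obs) * (mean(obs) - grand_mean) ** 2 for obs in groups.values()
    )
    # Unexplained (error) variation: observations around their own level mean.
    ss_within = sum(
        (x - mean(obs)) ** 2 for obs in groups.values() for x in obs
    )

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within  # this is the error variance

    return {
        "df_between": df_between,
        "df_within": df_within,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_value": ms_between / ms_within,
        "pooled_sd": sqrt(ms_within),
    }


result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the sums of squares, F value and pooled standard deviation agree with the Minitab analysis of variance table in the results (85.30, 349.67, 0.77 and 3.98672). The p-value is not computed here because the Python standard library has no F distribution; the reported value of 0.621 comes from Minitab.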
Note that the term error does not mean a mistake in this instance; it simply covers influences that could affect the overall result, such as environmental conditions and other underlying factors, for example driver error, trains not leaving stations on time, the impact of other trains and peak travel times. Furthermore, the learning effect should have no effect on the final results in this case. Learning effects should not be confused with random variation: random variation is unexplained variation, whereas a learning effect such as driver training is explained variation, and such training should be carried out beforehand under supervision in order to account for the learning effect.

Randomisation

To give a true reflection of the project in hand, the logistics of the project required the experiment to be performed in a random run order rather than a fixed one. To achieve a fair random selection the researcher randomly picked days of the week, Monday to Friday, using every first to third day or second to fourth day. This gave each train time on the track equal status over the project phase and prevented any factor level from being more prominent than another during the test, regardless of the time permitted by the company. The reason for randomising is that there may be some progressive change over the course of the experiment, as stated earlier, and a random order addresses this concern.

Blocking

Blocking was taken into account in this case, but only after the fact and not prior to the random selection. It became apparent to the researcher that the different days of the week could have an effect on the output response, as can be seen to a small extent in the data below.
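The day-of-week pattern mentioned above can be checked with a quick tabulation of the delay data. The sketch below is only illustrative: it assumes the six columns of the data table are Monday, Wednesday and Friday of week one followed by the same days of week two, and that the values in each row appear in that column order. The function and variable names are hypothetical.

```python
from statistics import mean

# ASSUMPTION: column order taken from the header of the data table in this post.
DAYS = ["MON", "WED", "FRI", "MON", "WED", "FRI"]

# Delay observations per time band, one value per column above.
delays = {
    "0630-0841": [5, 4, 5, 0, 5, 4],
    "0930-1200": [12, 6, 8, 2, 4, 3],
    "1305-1543": [7, 6, 3, 0, 2, 8],
    "1505-1742": [8, 14, 1, 9, 7, 3],
    "1920-2147": [4, 2, 8, 16, 4, 1],
}


def mean_delay_by_day(delays, days=DAYS):
    """Average the delays for each day of the week across all time bands."""
    by_day = {}
    for row in delays.values():
        for day, value in zip(days, row):
            by_day.setdefault(day, []).append(value)
    return {day: mean(values) for day, values in by_day.items()}


day_means = mean_delay_by_day(delays)
for day, value in day_means.items():
    print(f"{day}: {value:.1f}")
```

Under these assumptions the mean delay falls from 6.3 on Mondays to 5.4 on Wednesdays and 4.4 on Fridays. With only ten observations per day this is an indication rather than evidence, which is why a repeated study with days treated as blocks would be needed to confirm it.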
To get a true reflection of this, the researcher would repeat the test with a different blocking arrangement, over a longer period of time, to see whether the output response at the start of the week is greater than at the end of the week, as the number of people travelling declines as the week progresses. This is one element of the test that the researcher would look at in more detail if it were repeated.

Data Collected from Real-time Analysis

Different Time of Day (response) by Time Of Day (Factor Levels)

Time Of Day  MON  WED  FRI  MON  WED  FRI
0630-0841      5    4    5    0    5    4
0930-1200     12    6    8    2    4    3
1305-1543      7    6    3    0    2    8
1505-1742      8   14    1    9    7    3
1920-2147      4    2    8   16    4    1

RESULTS

Descriptive Statistics: Different time of the day

Variable: Different time of the day

Levels      N  N*  Mean   SE Mean  StDev  Minimum  Q1     Median
0630 0841   4  0   4.500  0.289    0.577  4.000    4.000  4.500
0630 0841   2  0   2.50   2.50     3.54   0.00     *      2.50
0930- 1200  2  0   7.00   5.00     7.07   2.00     *      7.00
0930 1200   4  0   5.25   1.11     2.22   3.00     3.25   5.00
1305 1543   6  0   4.33   1.28     3.14   0.00     1.50   4.50
1505 1742   4  0   5.25   1.93     3.86   1.00     1.50   5.50
1505 -1742  2  0   10.50  3.50     4.95   7.00     *      10.50
1920 2147   6  0   5.83   2.26     5.53   1.00     1.75   4.00

One-way ANOVA: Different time of the day versus Levels (4 in 1 overview)

Method

Null hypothesis: All means are equal
Alternative hypothesis: At least one mean is different
Significance level: α = 0.05
Equal variances were assumed for the analysis.

Factor Information

Factor  Levels  Values
Levels  8       0630 0841, 0630 0841, 0930- 1200, 0930 1200, 1305 1543, 1505 1742, 1505 -1742, 1920 2147

Model Summary

S        R-sq    R-sq(adj)  R-sq(pred)
3.98672  19.61%  0.00%      0.00%

Means

Levels      N  Mean   StDev  95% CI
0630 0841   4  4.500  0.577  (0.366, 8.634)
0630 0841   2  2.50   3.54   (-3.35, 8.35)
0930- 1200  2  7.00   7.07   (1.15, 12.85)
0930 1200   4  5.25   2.22   (1.12, 9.38)
1305 1543   6  4.33   3.14   (0.96, 7.71)
1505 1742   4  5.25   3.86   (1.12, 9.38)
1505 -1742  2  10.50  4.95   (4.65, 16.35)
1920 2147   6  5.83   5.53   (2.46, 9.21)

Pooled StDev = 3.98672

Simple Regression Analysis: Analysis of Variance

In carrying out the regression analysis it is important to understand that the first hypothesis in regression is H0: β = 0, where β is the slope; in this case it corresponds to the p-value reported for Levels. This null hypothesis states that X is not a useful predictor of Y or, graphically, that the regression line is horizontal. If the null hypothesis is accepted, it indicates that there may be no predictive relationship at all between X and Y, and the analysis is over. But if this null hypothesis is rejected, it indicates that there is a predictive relationship between X and Y, and so it is useful to construct a regression equation for predicting values of Y (Reilly 2017, pg. 97).

The second hypothesis, H0: α = 0, where α is the intercept, is not considered in this case, as there is no constant present in the results stated below. If that null hypothesis were accepted, it would mean that the regression line may pass through the origin, or that Y is directly proportional to X, so that any change in X would be matched by an identical percentage change in Y (Reilly 2017, pg. 98).

The Minitab software output is stated below.

Source  DF  Adj SS  Adj MS  F-Value  P-Value
Levels  7   85.30   12.19   0.77     0.621
Error   22  349.67  15.89
Total   29  434.97

The p-value for Levels is 0.621, which is greater than 5%, so we cannot reject the null hypothesis in this case: the data do not provide evidence that the time of day affects the response.

One-way ANOVA individual Observational Data

Residuals vs Fits for Different time of the day

In this case you can notice that on-peak times have consistently lower scores than the other train times. You can also notice that the x-axis marks are unequally spaced: the length between the ticks is proportional to the number of scores (observations) for each arrival time.

The following observations were noted. The lines near the centre of each line represent the arrival mean.
At a glance, the means for the arrivals look different, although the ANOVA p-value of 0.621 indicates that the differences are not statistically significant. The vertical span of each line represents the 95% confidence interval for the mean of each arrival.

Additional Observational Data

Normal plot of Residuals for Different time of the day

In this case the plot above indicates that the arrival times are reasonably normal. There is some scatter, with one outlier; however, the points are roughly linear in this instance.

Residual Histogram for Different time of the day

In this case the result shows that the data are positively skewed (to the right). This means that a train's delay might be much longer than expected, but the train could not arrive much earlier than expected, because it cannot leave the last station prior to its scheduled time.

Residuals vs Order for Different time of the day

In this case the time series plot shows a spike. This marks a late arrival outside the expected arrival time, after which the delays return to the expected level. It corresponds to the outlier noted in the additional data above.

Conclusion

References

http://www.irishrail.ie/about-us/train-performance
http://www.irishrail.ie/about-us/2014-performance
http://www.irishrail.ie/about-us/2015-performance
