

Statistical Issues in Interactive Web-based Public Health Data Dissemination Systems

MICHAEL A. STOTO

WR-106

October 2003

Prepared for the National Association of Public Health Statistics and Information Systems

EXECUTIVE SUMMARY

State- and community-level public health data are increasingly being made available on the World Wide Web for the use of professionals and the public. The goal of this paper is to identify and address the statistical issues associated with these interactive data dissemination systems. The analysis is based on telephone interviews with 14 individuals in five states involved with the development and use of seven distinct interactive web-based public health data dissemination systems, as well as experimentation with the systems themselves.

Interactive web-based systems offer state health data centers an important opportunity to disseminate data to public health professionals, local government officials, and community leaders, and in the process raise the profile of health issues and involve more people in community-level decision making.

The primary statistical concerns with web-based dissemination systems relate to the small number of individuals in the cells of tables when the analysis is focused on small geographic areas or in other ways. In particular, data for small population groups can be lacking in statistical reliability, and also can have the potential for releasing confidential information about individuals. These concerns are present in all statistical publications, but are more acute in web-based systems because of their focus on presenting data for small geographical areas.

Small numbers contributing to a lack of statistical reliability

One statistical concern with web-based dissemination systems is the potential loss of statistical reliability due to small numbers. This is a concern in all
statistical publications, but it is more acute in web-based systems because of their focus on presenting data for small geographical areas and other small groups of individuals.

There are a number of statistical techniques that interactive data dissemination systems can use to deal with the lack of reliability resulting from small cell sizes. Aggregation approaches can help, but information is lost. Small cells can be suppressed, but even more information is lost. (The best rationale for numerator-based data suppression is confidentiality protection, not statistical reliability.) In general, approaches that quantify the uncertainty statistically (such as confidence intervals and χ² tests), or that use smoothing or small-area model-based estimation, should be preferred to options that suppress data or give counts but not rates.

Small numbers and confidentiality concerns

The primary means for protecting confidentiality in web-based data dissemination systems, as in more traditional dissemination systems, is the suppression of “small” cells, plus complementary cells, in tables. The definition of “small” varies by state, and often by dataset. This approach often results in a substantial loss of information and utility.

Statisticians in a number of state health data centers have recently reconsidered data suppression guidelines currently in use and have developed creative and thoughtful new approaches, as indicated above. Their analyses, however, have not been guided by theory or statistical and ethical principles, and have not taken account of the extensive research on these issues and the development of new methods that have taken place in the last two decades. Government and academic statisticians, largely outside of public health, have developed a variety of “perturbation” methods such as “data swapping” and “controlled rounding” that can limit disclosure risk while maximizing information available to the
user. The Census Bureau has developed a “confidentiality edit” to prevent the disclosure of personal data in tabular presentations. The disclosure problem can be formulated as a statistical decision problem that explicitly balances the loss that is associated with the possibility of disclosure and the loss associated with nonpublication of data. Such theory-based and principled approaches should be encouraged.

Concept validity and data standards

Statisticians have been concerned ever since computers were introduced that the availability of data and statistical software would lead untrained users to make mistakes. While this is probably true to some extent, restricting access to data and software is not likely to succeed in public health. The introduction of interactive web-based dissemination systems, on the other hand, should be seen as an important opportunity to develop and extend data standards in public health data systems.

Web-based dissemination systems, because they require that multiple data systems be put into a common format, present opportunities to disseminate appropriate data standards and to unify state data systems. Educational efforts building on the dissemination software itself, as well as in more traditional settings, are likely to be more effective in reducing improper use of data than restricting access. For many users, such training will need to include content on using public health data, not just on using web-based systems. The development of standard reports for web-based systems can be an effective means for disseminating data standards.

Data validation

No statistical techniques can guarantee that there will be no errors in web-based data systems. Careful and constant checking of both the data and the dissemination system, as well as a policy of releasing the same data files to all users, however, can substantially reduce the likelihood of errors. Methods for validation should be documented and
shared among states. The development of web-based dissemination systems is an opportunity to implement data standards rather than a problem to be solved. Efforts to check the validity of the data for web dissemination purposes may actually improve overall data quality in state public health data systems.

General comments

The further development and use of web-based data dissemination systems will depend on a good understanding of the systems’ users and their needs. System designers will have to strike a balance between enabling users and protecting users from themselves. Systems will also have to develop ways to train users not only in how to use the systems themselves, but also on statistical issues in general and the use of public health data.

Research to develop and implement new statistical methods, and to better understand and address users’ needs, is a major investment. Most states do not have the resources to do this on their own. Federal agencies, in particular through CDC’s Assessment Initiative, could help by enabling states to share information with one another, and by supporting research on the use of new statistical methods and on data system users.

INTRODUCTION

State- and community-level public health data are increasingly being made available on the World Wide Web for the use of professionals and the public. Although most data of this sort currently available are simply static presentations of reports that have previously been available in printed form, interactive web-based systems are increasingly common (Friedman et al., 2001). The goal of this paper is to identify and address the statistical issues associated with interactive web-based state health data dissemination systems. This will include assessing the current data standards, guidelines, and/or best practices used by states in their dissemination of data via the Web for both static presentation of data
and interactive querying of data sets and analyzing the statistical standards and data dissemination policies, including practices to ensure compliance with privacy and confidentiality laws. Many of the same statistical issues apply to public health data however published, but interactive web-based systems make certain issues more acute. In addition, identifying and addressing these issues for interactive systems may also lead to overall state health data system improvement.

This analysis is based on telephone interviews with 14 individuals in five states involved with the development and use of seven distinct interactive web-based public health data dissemination systems, as well as experimentation with the systems themselves. All but one of the systems are currently in operation, but most are constantly being updated. The interviewees and information on the sites appear in Appendix A. The choice of these individuals and states was not intended to be exhaustive or representative, but to bring out as many statistical issues as possible. In addition, a preliminary draft of this paper was circulated for comment and was discussed at a two-day workshop at Harvard School of Public Health in August, 2002; attendees are listed in Appendix B. The current draft reflects comments by e-mail and at the workshop, but the analysis and conclusions, as well as any errors that may remain, are the author’s.

This paper begins with a background section that addresses the purposes, users and benefits of interactive data dissemination systems, systems currently in place or being developed, and database design as it affects statistical issues. The body of the paper is organized around four substantive areas: (1) small numbers contributing to a lack of statistical reliability; (2) small numbers leading to confidentiality concerns; (3) concept validity and data standards; and (4) data validation. The paper concludes with a summary and
conclusions. A glossary of key terms appears in Appendix C.

BACKGROUND

Purposes, users, and benefits of interactive data systems

Interactive web-based data dissemination systems in public health have been developed for a number of public health assessment uses. One common use is to facilitate the preparation of community-level health profiles. Such reports are consistent with Healthy People 2010 (DHHS, 2000), and are increasingly common at the local/county level. In some states, they are required. This movement reflects the changing mission of public health from direct delivery of personal health care services to assessment and policy development (IOM, 1996, 1997). The reports are used for planning and priority setting as well as for evaluation of community-based initiatives. Minnesota, for instance, will use its interactive dissemination system to reshape the way that state and county health departments do basic reports by facilitating, and hence encouraging, the use of certain types of data. The system is intended to provide better and more current information to the public than is available in the current static system, in which data are updated only every two years.

From another perspective, the purpose of web-based dissemination systems is to enable local health officials, policy makers, concerned citizens, and community leaders who are not trained in statistics or epidemiology to participate in public health decision-making. Because many of these users are not experienced data users, some systems are designed to help users find appropriate data. MassCHIP, for instance, was designed with multiple ways into datasets so users are more likely to “stumble upon” what they need. Users can search, for instance, using English-language health problems lists and Healthy People objectives, as well as lists of datasets.

Web-based dissemination systems are also a natural outgrowth of the activities of state health data centers. The
systems allow users to prepare customized reports (their choice of comparison areas, groups of ICD codes, age groups, and so on). So in addition to making data available to decision makers and the public, they also facilitate work already done by state and local public health officials and analysts. This includes fulfilling data requests to the state data center as well as supporting statistical analyses done by subject area experts. States have seen substantial reduction in the demand on health statistics staff for data requests. In at least one case, the system itself has helped to raise the profile of the health department with legislators.

Interactive web data systems are also being used to detect and investigate disease clusters and outbreaks. This includes cancer, infectious diseases, and, increasingly, bioterrorism. Interactive web systems are also being used, on a limited basis, for academic research, or at least for hypothesis generation. The software that runs some of these systems (as opposed to the state health data that are made available through it) has also proven useful for research purposes. Nancy Krieger at the Harvard School of Public Health, for instance, is using VistaPH to analyze socio-economic status data in Massachusetts and Rhode Island, and others are using it in Duval County, Florida and Multnomah County, Oregon.

Some states are also building web-based systems to bring together data from a number of health and social service programs and make them available to providers in order to simplify and coordinate care and eligibility determination. Such systems can provide extremely useful statistical data, and in this sense are included in this analysis. The use of these systems for managing individual patients, however, is not within the scope of this paper.

Reflecting the wider range of purposes, the users of web-based data systems are very diverse. They include local health officials, members of
boards of health, community coalitions, as well as concerned members of the public. Employees of state health data centers, other health department staff, and employees of other state agencies; hospital planners and other health service [...]

users with no training in statistics or epidemiology, and build appropriate choices into the system software.

Data can be analyzed in different ways, and in some cases must be for purposes of consistency. The SIDS rate, for instance, can be calculated using births in the year as the denominator (the demographic approach), or by linking individual birth and death records (the epidemiologic approach). Both are valid, but one may be preferable to the other depending on the comparisons that will be made. Incidence and prevalence rates, as well as cause-specific mortality rates, can be presented in crude form or standardized to a common population base to clarify comparisons across time and geographical units. Whether standardization is done at all, and if so what standard population is used, whether a direct or an indirect approach is used, and the choice of age groups are all matters of judgment. Health services researchers frequently “risk adjust” hospital outcomes data to reflect different patient populations. How to do so, however, is a matter of judgment and of data availability.

Some of these issues can be addressed through data standards. Such standards might address the handling of missing data, methods for age adjustment (e.g., minimum numbers for direct adjustment), CDC surveillance case definitions, and race and ethnicity definitions. Data standards might also specify common population denominators for all rates in the system.

National documents and agencies can help guide the choice of data standards. These include Healthy People 2010, and especially the Leading Health Indicators, HRSA’s Community Health Status Indicators, CDC’s Behavioral Risk Factor Surveillance System (BRFSS), and the health care surveys and data systems maintained by the Agency for Healthcare Research and Quality (AHRQ). MassCHIP, for instance, builds data standards and guidelines into the system through meta-data. Minnesota is planning to incorporate the emerging National Electronic Disease Surveillance System (NEDSS) standards, developed by CDC for surveillance data systems (CDC, undated), into its state system. The NEDSS standards currently focus on details of data transmission and other technical issues rather than on statistical standards of the kind described in the previous paragraph, but in the future NEDSS standards might expand to include more statistical issues. Ultimately, the development of a web-based dissemination system drawing on data from various parts of a state health department and other agencies can provide an opportunity to develop or revise necessary standards.

While helpful, however, data standards alone are unlikely to ensure that data are used properly. Indeed, some system designers seem concerned that providing too much data to non-sophisticated users might lead to misuse or misinterpretation. As Sandra Putnam noted, members of the local health councils are generally not trained in health assessment; they consist of mayors, police chiefs, physicians and nurses. The University of Tennessee, Knoxville therefore has offered training in the use of the HIT system. They try to teach people about age adjustment and combining data over three to five years, but have had to spend a considerable amount of time simply helping people use the web itself. In Washington, an independent epidemiologist-demographer consultant uses VistaPH to help counties with less sophistication prepare health profiles. David Solet in Washington notes that users need sophistication in designing assessment studies and interpreting the results, not in operating the system. Much
of the training for VistaPH has focused on basic epidemiology and using Excel to make graphs. Washington has also posted “Assessment guidelines” on the web (Washington State Department of Health, 2001). Richard Hoskins notes, similarly, that the issue is not the technology but being sure that people know how to use it properly. Hoskins therefore does training on proper ways to do disease mapping, not just on how to use the EpiQMS to do it.

One approach to dealing with users who are not trained in public health assessment is to develop standard reports for the web-based data dissemination system. MassCHIP, for instance, has “Instant Topics” reports for Healthy People 2010, Healthy Start, minority health, and other topics. Standard reports of this type not only make the system easier to use, but also ensure that users employ appropriate and comparable variables for their community health profiles. Users might also take these reports as models for other topics. Massachusetts is also developing “wizards” to help occasional users.

Some states use a peer education approach. Minnesota, for instance, has an epidemiology users group that meets regularly. This group, which includes epidemiologists from a variety of program areas, has helped the state adopt the 2000 standard population and deal with other data policy issues. The group has also given guidance regarding the development of the state’s web-based dissemination system, and its members train one another in the system’s use.

A number of the state web-based dissemination systems are currently being used as part of the curriculum in schools of public health and other academic institutions. In addition to training future users in the use of the system and in public health assessment techniques more generally, presentations to colleges and universities help to raise awareness about the system itself.

Most users, however, will not have the opportunity for such training. Web-based
dissemination systems, therefore, incorporate various approaches to training and documentation. These include on-line tutorials, context-sensitive help screens, and help desks. Web-based dissemination systems also point users to external training material at the National Center for Health Statistics (NCHS, 1999), the Centers for Disease Control and Prevention, professional organizations and universities.

Tennessee’s HIT system does age adjustment, and its associated educational efforts focus on when and why rather than how. The system also provides a variety of charts, as well as user-defined comparisons and ranking tools, which are seen as a way to lead users to proper analyses. The HIT system does not use more sophisticated models because its developers feel that the system’s users are more comfortable with “real” counts and rates and are not sure how the users would interpret the results. As is common in other states, Tennessee users often want only the data for their own area, so the system does not merge data from adjacent areas.

Conclusions and recommendations

Ever since computers were introduced, statisticians have been concerned that the availability of data and statistical software would lead untrained users to make mistakes. While this is probably true to some extent, restricting access to data and software is not likely to succeed in public health. Introduction of interactive web-based dissemination systems, on the other hand, should be seen as an important opportunity to develop and extend data standards in public health data systems.

Educational efforts building on the dissemination software itself, as well as in more traditional settings, are likely to be more effective in reducing improper use of data than restricting access. For many users, such training will need to include content on using public health data, not just on using web-based systems. The development of standard reports for web-based
systems can be an effective means for disseminating data standards.

Data validation

Before public health data are published in printed form, subject matter and statistical experts review the tables and charts to ensure that errors or inconsistencies in the data are found and corrected. When web-based dissemination systems create analyses that have not gone through this process, embarrassing errors can occur. This is especially true when web-based systems allow for more geographical detail than is otherwise published. Suppose, for instance, that data are available by Zip code. Transposition or geocoding errors might assign a small number of cases from a large city to a rural area with a similar Zip code. Errors of this sort would not be noticed in state-level analyses, or in the large city. In areas with small population and few deaths, however, the addition of one or two miscoded cases would, in relative terms, be a major error.

Errors in web-based data dissemination systems are of two types. First, there are errors in the raw data and in any measures that are based on them. Second, data systems themselves can introduce errors in processing or statistical analysis. Validation approaches can address one or both of these sources.

Errors occur because data systems make at least three transformations: (1) from individual records to statistical variables for different geographical areas, demographic groups, and so on; (2) from variables as originally recorded to recoded measures that are more comparable across datasets and more suitable for data analysis; and (3) from one data definition to another to account for differences in data standards from year to year in the same datasets. Processing of missing and unknown values can also lead to problems.

State health data centers have taken a variety of approaches to data validation. MassCHIP, for instance, strives for “100% validation.” Every time a small change is made in the
system, its developers test for unintended changes elsewhere. This requires many hours of trying the system on real data, comparing to previous reports (published and unpublished), and looking for suspicious results. Subject matter experts are also involved in validation efforts. In Washington, there are three to four people in the department with their own programs, and they compare their results “all the time” to make sure that they get the same results as VistaPH. They use a “re-extraction” process that looks back over 20 years of data. Similarly, before the Tennessee system is opened to the public, it is checked against every available printed report (published or in-house). Better documentation of what is already done would be useful to other states, whether they currently have web-based dissemination systems or are developing them.

Another approach is to use the same dataset for every purpose. In Washington, for instance, the state health data center releases the same data to VistaPH and EpiQMS as to NCHS and others, so it benefits from multiple internal edits. Because there are so many users, many people look at these data and urge the data center to make corrections when needed. Validation efforts of this sort, however, can only detect errors due to processing or statistical calculations, since every system has the same raw data.

Conclusions and recommendations

No statistical techniques can guarantee that there will be no errors in web-based data systems. Careful and constant checking of both the data and the dissemination system, as well as a policy of releasing the same data files to all users, however, can substantially reduce the likelihood of errors. Methods for validation should be documented and shared among states. The development of web-based dissemination systems is an opportunity to implement data standards rather than a problem to be solved. Efforts to check the validity of the data for web dissemination purposes may
actually improve overall data quality in state public health data systems.

CONCLUSIONS

Web-based systems offer state health data centers an important opportunity to disseminate data to public health professionals, local government officials, and community leaders, and in the process raise the profile of health issues and involve more people in community-level decision making. Web-based dissemination systems, because they require that multiple data systems be put into a common format, present opportunities to disseminate appropriate data standards and thereby unify state data systems. The work required to validate the data in the systems can also result in better overall quality data in the state system.

The primary statistical concerns with web-based dissemination systems relate to small numbers. In particular, data for small population groups can be lacking in statistical reliability, and also can have the potential for releasing confidential information about individuals. These concerns are present in all statistical publications but are more acute in web-based systems because of their focus on presenting data for small geographical areas. Data suppression can resolve some of these problems but results in a significant loss of information. Aggregation, whether advised or automatic, is preferable but also results in a loss of information. Formal statistical methods (confidence intervals, significance tests, geographical and model-based smoothing, and other methods for small area statistics) allow the maximum amount of information to be disseminated while at the same time honestly communicating to users about statistical reliability and protecting the confidentiality of individual health data. While models of this sort exist in other areas, their application in public health has been limited and should be further explored.

The further development and use
of web-based data dissemination systems will depend on a good understanding of the systems’ users and their needs. System designers will have to strike a balance between enabling users and protecting them from themselves. Systems will also have to develop ways to train users, not only in how to use the systems themselves, but also on statistical issues in general and the use of public health data.

Research to develop and implement new statistical methods, and to better understand and address users’ needs, is a major investment. Most states do not have the resources to do this on their own. Federal agencies, in particular through CDC’s Assessment Initiative, could help by enabling states to share information with one another, and by supporting research on the use of new statistical methods and on data system users.

REFERENCES

Anderson RN, Rosenberg HM, 1998. Age Standardization of Death Rates: Implementation of the Year 2000 Standard. National Vital Statistics Reports, 47; 20 pp. (PHS) 98-1120.

Centers for Disease Control and Prevention, undated. Behavioral Risk Factor Surveillance System [www.cdc.gov/brfss/index.htm].

CDC, undated. Supporting public health surveillance through the National Electronic Disease Surveillance System (NEDSS) [www.cdc.gov/od/hissb/docs/NEDSS%20Intro.pdf].

Cohen BB, 2001. Guidelines for the release of aggregate statistical data: Massachusetts perspective on issues and options. Presentation at the Assessment Initiative/NAPHSIS Conference, September 12, 2001.

Department of Health and Human Services, 2000. Healthy People 2010: Understanding and Improving Health. Washington: Government Printing Office.

Devine OJ, and Louis TA, 1994. A constrained empirical Bayes estimator for incidence rates in areas with small populations. Statistics in Medicine 13: 1119-33.

Doyle P, Lane JI, Theeuwes JJM, and Zayatz LM, eds., 2001. Confidentiality, Disclosure, and Data Access: Theory and Practical Application for Statistical Agencies.
Amsterdam: Elsevier Science BV.

Duncan GT, Fienberg SE, Krishnan R, Padman R, Roehrig SF, 2001. Disclosure limitation methods and information loss for tabular data. Chapter in Confidentiality, Disclosure, and Data Access: Theory and Practical Application for Statistical Agencies, Doyle P, Lane JI, Theeuwes JJM, and Zayatz LM, eds. Amsterdam: Elsevier Science BV.

Duncan GT, 2001. Confidentiality and statistical disclosure limitation. In International Encyclopedia of the Social and Behavioral Sciences (cited in Duncan et al., 2001).

Federal Committee on Statistical Methodology, 1994. Statistical Policy Working Paper 22 – Report on Statistical Disclosure Limitation Methodology. Washington: Statistical Policy Office, Office of Management and Budget.

Fienberg SE, Makov UE, Steele RJ, 1998. Disclosure limitation using perturbation and related methods for categorical data. Journal of Official Statistics 14: 485-502.

Friedman DJ, Anderka M, Krieger JW, Land G, Solet D, 2001. Accessing population health information through interactive systems: Lessons learned and future directions. Public Health Reports 116: 116-147.

Gostin LO, 2001. National Health Information Privacy: Regulations Under the Health Insurance Portability and Accountability Act. JAMA 285: 3015-3019.

Health Resources and Services Administration (HRSA), 2000. Community Health Status Indicators Project [www.communityhealth.hrsa.gov].

Institute of Medicine (IOM), 1996. Healthy Communities: New Partnerships for the Future of Public Health, Stoto MA, Abel C, and Dievler A, eds. Washington: National Academy Press.

IOM, 1997. Improving Health in the Community: A Role for Performance Monitoring, Durch JS, Bailey LA, and Stoto MA, eds. Washington: National Academy Press.

Land G, undated. Confidentiality data release rules.

Metropolitan Washington Public Health Assessment Center, 2001. Community Health Indicators for the Washington Metropolitan Region.

National Center for Health Statistics,
undated Web-Based Resource Center Work Group Suggested Web Sites for Health Data Standards [www.cdc.gov/nchs/otheract/phdsc/wbasedwg_sites.htm] NCHS, 1999 Public Health: Our Silent Partner [www.cdc.gov/nchs/products/training/phd-osp.htm] National Research Council (NRC), 2000a Small-Area Estimates of School-Age Children in Poverty: Evaluation of Current Methodology Citro CF and Kalton G, eds Washington: National Academy Press NRC, 2000b Small-Area Income and Poverty Estimates: Priorities for 2000 and Beyond Citro CF and Kalton G, eds Washington: National Academy Press Pickle LW, Mungiole M, Jones GK, White AA, 1996 Atlas of United States Mortality DHHS Publication No (PHS) 97-1015 Hyattsville, MD: U.S Department of Health and Human Services Shen W, Louis TA, 1999 Empirical Bayes estimation via the smoothing by roughening approach J Computational and Graphical Statistics, 8: 800-823 Shen W, Louis TA, 2000 Triple-goal estimates for disease mapping Statistics in Medicine, 19: 2295-2308 Statistical Issues in Web-Based Public Health Data Systems 51 Sweeney L, 1997 Weaving technology and policy together to maintain confidentiality J Law Med Ethics, 25(2-3): 98-110 Sweeney L, 1998 Privacy and medical-records research NEJM, 338(15):1077 UCLA Center for Health Policy Research, 2002 California Health Interview Survey Fact Sheet [www.healthpolicy.ucla.edu/publications/CHIS_Fact_Sheet_A.pdf] U.S Department of Health and Human Services (DHHS), 2000 Healthy People 2010: Understanding and Improving Health Washington: Government Printing Office Washington State Department of Health, 2001 Data guidelines [www.doh.wa.gov/Data/Guidelines/guidelines.htm] Zaslavsky AM, Horton NJ, 1998 Balancing disclosure risk against the loss of nonpublication Journal of Official Statistics 14: 411-419 Statistical Issues in Web-Based Public Health Data Systems 52 Appendix A: Individuals who were Interviewed or Commented on the First Draft and State Websites Massachusetts Daniel Friedman, Assistant 
Commissioner, Bureau of Health Statistics, Research and Evaluation Washington John Whitbeck, Research Manager, Center for Health Statistics Marlene Anderka, Director, Office of Statistics and Evaluation David Solet, Assistant Chief Epidemiologist MassCHIP: Massachusetts Community Health Information Profile (masschip.state.ma.us) James Allen, Senior Systems Analyst Minnesota John Oswald, Director, Center for Health Statistics Peter Rode Steve Ring Richard Hoskins, Public Health Geographer & Senior Epidemiologist EpiQMS (www5.doh.wa.gov/epiqms) VistaPH (www.doh.wa.gov/OS/Vista/HOMEP AGE.HTM) Richard Fong Missouri Garland Land, Director, Center for Health Information Management & Evaluation Norma Helmig, Chief, Bureau of Health Resources Statistics Eduardo Simoes, State Epidemiologist MICA: Missouri Information for Community Assessment (www.health.state.mo.us/MICA/ nojava.html) Tennessee Sandra Putnam, Director, Community Health Research Group, University of Tennessee Knoxville HIT: Health Information Tennessee (hitspot.utk.edu) Others Bruce Cohen Massachusetts Department of Public Health Daniel Goldman, System Developer Expert Health Data Programming, Inc VitalNet ((http://www.ehdp.com/vitalnet/) John Paulson Center for Public Health Data and Statistics Ohio Department of Health Statistical Issues in Web-Based Public Health Data Systems 53 Appendix B: Review Meeting Participants Harvard School of Public Health Boston, Massachusetts August and 7, 2002 Marlene Anderka Office of Statistics and Evaluation Massachusetts Department of Public Health Susan Elder Center for Health Information Management and Evaluation Missouri Department of Health and Senior Services Dan Freidman Assistant Commissioner Massachusetts Department of Public Health Tim Green Division of Public Health Surveillance & Informatics Epidemiology Program Office Centers for Disease Control and Prevention Patricia Guhleman Bureau of Health Information Division of Health Care Financing Wisconsin Department 
of Health and Family Services Stepahnie Haas School of Information and Library Science University of North Carolina Ken Harris National Center for Health Statistics, Centers for Disease Control and Prevention Steve Lagakos Department of Biostatistics Harvard School of Public Health Luis Paita National Association of Health Data Organizations Sandra Putnam Community Health Research Group University of Tennessee-Knoxville Jamie Ritchey National Association of Public Health Statistics and Information Systems Tim Stephens National Association of Public Health Statistics and Information Systems Michael Stoto RAND and Department of Biostatistics Harvard School of Public Health Neil Thomas East Tennessee State University Al Zarate National Center for Health Statistics Alan Zaslavsky Department of Health Care Policy Harvard Medical School Statistical Issues in Web-Based Public Health Data Systems 54 Appendix C: Glossary Complementary suppression: Suppression of cells in the same row or column of a small cell to avoid discovery of the number of cases by subtraction Confidence interval: A statistical interval based on a statistical sample calculated in such a way that it will include the value of the populations statistic being estimated with a certain likelihood, usually 95 percent The confidence interval can be interpreted as a range of values that we are reasonably confident contains the true (population) [proportion] “or parameter” - TG Confidentiality: The ability (or inability) to identify the individuals, small groups of individuals, or other entities represented in a database and use information in the database to discover their characteristics that would otherwise not be known Geographic smoothing: Statistical technique to provide more reliable estimates for small areas, based on the assumption that geographically proximate areas have similar health outcomes Hierarchical Bayesian modeling: Statistical technique to provide more reliable estimates for small areas, 
based on the assumption that non-geographic factors such as socioeconomic status are related to health outcomes Hypothesis tests: Statistical technique to determine whether differences (between groups or over time) are due to chance Microdata: Individual-level data Sampling variability: Uncertainty in statistical estimates based on a random sample due to the sampling itself, i.e that repetitions of the same sampling process would yield slightly different results due to random selection Standardization: Statistical adjustment to reflect differences (usually in the age and sex distribution) between two populations being compared Stochastic variability: Uncertainty in statistical estimates due to natural variability in the process being measured For instance, even though two communities may have the same, unchanging conditions that affect mortality, and cases would be expected in each community every year, the actual number in any given year could be and 8, and 7, and so on, simply due to chance .. .Statistical Issues in Interactive Web-based Public Health Data Dissemination Systems EXECUTIVE SUMMARY State- and community-level public health data are increasingly being made available... improve overall data quality in state public health data systems Statistical Issues in Web-Based Public Health Data Systems 47 CONCLUSIONS Web-based systems offer state health data centers an... statistical issues apply to public health data however published, but interactive web-based systems make certain issues more acute In addition, identifying and addressing these issues for interactive systems
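
As an illustration of the confidence interval entry above, and of why small numbers reduce statistical reliability, the following is a minimal sketch in Python. It uses the Wilson score interval, which is a choice made for this illustration and not a method prescribed by this paper. The function name, the case counts, and the population sizes are all hypothetical.

```python
import math


def wilson_interval(cases, population, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (cases / population).

    The Wilson method is an assumption made for this sketch; z=1.96
    corresponds to the usual 95 percent confidence level.
    """
    if population <= 0 or not 0 <= cases <= population:
        raise ValueError("need population > 0 and 0 <= cases <= population")
    p = cases / population
    denom = 1 + z ** 2 / population
    center = (p + z ** 2 / (2 * population)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / population + z ** 2 / (4 * population ** 2))
    ) / denom
    return center - half_width, center + half_width


# Hypothetical areas with the same observed rate (1 percent) but very
# different population sizes.
small_area = wilson_interval(5, 500)
large_area = wilson_interval(500, 50_000)

print(f"small area: {small_area[0]:.4f} to {small_area[1]:.4f}")
print(f"large area: {large_area[0]:.4f} to {large_area[1]:.4f}")
```

Both intervals contain the observed rate of 0.01, but the interval for the small area is much wider, which is the sense in which estimates based on small numbers are less reliable.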
