Cross National Time Series Data Archive User’s Manual

June 8, 2024

CROSS-NATIONAL TIME-SERIES DATA ARCHIVE USER’S MANUAL

Introduction

The Cross-National Time-Series Data Archive (CNTS) was launched by Arthur S. Banks in the fall of 1968 at the State University of New York at Binghamton. The archive was, in part, the outcome of an effort initiated some years earlier to assemble, in machine readable, longitudinal format, certain of the aggregate data resources of The Statesman’s Yearbook, an annual with a history of continuous publication since 1864, which had never been systematically mined for quantitative materials of potential utility for comparative social scientists. However, many of the data extracted from this source proved to be of questionable reliability (particularly for the earlier years) and a large number of additional sources were ultimately consulted (see Sources and Source Identification, below).
In establishing the archive, it was decided to assemble materials dating, insofar as possible, from 1815 (immediately after the Congress of Vienna and formation of the modern international system). It was also decided that all commonly recognized members of the international community would be represented, excluding a handful of quasi-states such as Andorra, Liechtenstein, Monaco, and Vatican City. In 1977, data for the latter were also introduced, with coverage extending from 1975.
The original file was punched and stored on IBM cards, but these quickly became too numerous for efficient utilization and, in the fall of 1969, were abandoned in favor of tape storage, for which various update, listing, and extraction procedures were concurrently developed.
In January 1971, 102 of the archive’s variables were presented in a volume entitled Cross-Polity Time-Series Data (M.I.T. Press). For some years thereafter, magnetic tape copies of the file were distributed from Binghamton. Internet access was initiated in December 1997.
Updating the file lagged somewhat in the two decades prior to the compiler’s retirement in 1996, but has since been accelerated, with most variables relatively current as of mid-2007, save for a few (such as Telegraph Mileage) whose measurement is now of little relevance, or others (such as Urbanization in smaller cities) for which data is no longer available.
The problem of missing data has been addressed as follows. Short-term gaps between “hard data” entries (signified by alphabetic entries in field location 9) are remedied by means of an inverse compound interest procedure, save for some of the early population data, for which simple averaging was employed.
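The gap-filling step can be illustrated with a short sketch. It assumes that the “inverse compound interest procedure” amounts to fitting a constant annual growth rate between two hard-data entries; the manual does not give the formula, so this reading is an assumption, and the function name fill_gap is hypothetical.

```python
def fill_gap(year0, value0, year1, value1):
    """Estimate values for the years strictly between two known entries.

    Assumption (not stated in the manual): values grow at a constant
    compound annual rate between the two hard-data endpoints.
    """
    if year1 <= year0:
        raise ValueError("year1 must be later than year0")
    if value0 <= 0 or value1 <= 0:
        raise ValueError("compound growth needs positive endpoint values")
    n = year1 - year0
    growth = (value1 / value0) ** (1.0 / n)  # annual growth factor
    return {year0 + k: value0 * growth ** k for k in range(1, n)}


# Hypothetical example: hard data for 1850 and 1853, two missing years.
estimates = fill_gap(1850, 1000.0, 1853, 1331.0)
for year, value in sorted(estimates.items()):
    print(year, round(value, 1))
```

Simple averaging, used for some of the early population data, would instead place the missing values on a straight line between the two endpoints.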
Given the wide variety of sources, varying degrees of reliability are to be expected. The file is, however, an open one, and corrections are constantly being made as they become known to the compiler. The structure of the archive, its content, coding criteria and sources (as of November, 2007) are detailed below.

STRUCTURE OF THE ARCHIVE

The archive has 194 variables and contains data for over 200 country units, with provision for entries from 1815 to 2006 (excluding the two modern wartime periods, 1914-1918 and 1940-1945). The basic structure of the archive is that of a rectangular matrix of periodically augmented records, each encompassing data for one country-year.

STRUCTURE OF THE DATA

The data is contained in the file, “CNTSDATA.xls”, and may be categorized in a variety of ways. First, all of the variables currently included in the file are longitudinal, rather than cross-sectional, in character. The temporal spans of the arrays vary, of course, depending on the availability of data and the relevance of an indicator at a given point in time. To cite the obvious, one would not expect to find telephone data for the first three-quarters of the nineteenth century; less obvious, perhaps, is the general lack of telegraph mileage data after 1939, attributable largely to the decline in relevance of the telegram as a means of communication in the contemporary era. Series terminated for reason of either source availability or relevance have the year of termination shown in the file, “Codebook.xls”.
Second, the overwhelming proportion of the data are interval-scaled, that is to say, expressed in true numeric units, be they dollars, miles, or what have you. The only ordinal-scaled data (ranked on a “more” or “less” basis without the implication of true numeric units) are certain of the political items in Legislative Process Data and Political Data. Only four variables, Type of Regime (polit01), Head of State (polit05), Premier (polit06) and Effective Executive (Type) (polit07) are nominal-scaled (classified by qualitative category rather than on a “more”/“less” basis). While a variety of techniques have been developed for relatively sophisticated analysis of noninterval data, most of the readily accessible multivariate procedures remain regression-based, hence technically requiring an interval level of measurement.
Third, the file contains both primary and secondary (derived) data. The latter are calculated by mathematical manipulation of the primary data, most commonly by conversion of primary variables to per capita or per square mile form in order to achieve inter-nation comparability, and by recasting arrays on the basis of percent annual change.
Finally, most of the archive’s interval-scaled arrays contain both original and estimated data. Each datum referenced in “Bibliography.xls” by a nonnumeric symbol other than an “F” (Urbanization Data only), an “E”, or a “W” is an original entry, either taken directly or derived from an external source. The estimated data, on the other hand, are one of two principal types, depending on whether they were computer generated (as described above) or supplied by the compiler, usually on the basis of indirect evidence contained in the literature (including instances where initial or terminal original data points fall in the periods 1914-1918 or 1940-1945), to remedy obvious discrepancies in reported figures due to typographical or other error, or to “smooth” discontinuities resulting from longitudinal changes in external coding criteria. All such entries are referenced by an “E”. Finally, a limited number of less reliable estimates (identified by a “W”) are also included. These “working estimates” were originally inserted for analytic purposes under circumstances where missing data could not be tolerated, and should be viewed with extreme caution, particularly where they are used as bases for computer generated estimates.
An “F” serves one of two purposes. As used in conjunction with Urbanization Data largely in Population, Cities of 25,000 & Over (urban05) and Population, Cities of 20,000 & Over (urban07) it indicates entries calculated according to a proportional estimation procedure described in Arthur S. Banks and David L. Carr, “Urbanization and Modernization: A Longitudinal Analysis,” Studies in Comparative International Development, 9 (Summer, 1974), 26-45. Elsewhere it serves as a normal reference (see Sources and Source Identification, below).

VARIABLE DEFINITIONS AND CODING CRITERIA

The variable names, definitions and coding criteria are discussed below, all of which are summarized in “Codebook.xls”.

Identification Data

Three fields are used exclusively for identification purposes: year, code, and country. For a list of the country codes and country labels, see the file, “Independent States Since 1815.xls”.
Each country has a unique country code. Not all of the country labels are, however, invariant through time. Alternative labels are utilized, as follows, for the periods indicated:

Austrian Empire for Austria-Hungary, 1815-1866
Dahomey for Benin, 1960-1974
Upper Volta for Burkina Faso, 1960-1983
Khmer Republic for Cambodia, 1971-1974
Kampuchea for Cambodia, 1975-1989
Central African Empire for Central African Republic, 1976-1978
Republic of China for China, 1912-1948
Congo (Kinshasa) for Congo Democratic Republic, 1960-1963
Zaire for Congo Democratic Republic, 1971-1996
Congo (Brazzaville) for Congo Republic, 1960-1970
Ivory Coast for Cote d’Ivoire, 1960-1984
Santo Domingo for Dominican Republic, 1844-1921
United Arab Republic for Egypt, 1958-1960
Abyssinia for Ethiopia, 1898-1935
Persia for Iran, 1815-1913
Malagasy Republic for Madagascar, 1960-1970
Federation of Malaya for Malaysia, 1957-1962
Burma for Myanmar, 1948-1988
Yugoslavia for Serbia and Montenegro, 1919-2002
Ceylon for Sri Lanka, 1948-1970
Tanganyika for Tanzania, 1961-1962
Siam for Thailand, 1815-1913
Ottoman Empire for Turkey, 1815-1913
Russia for USSR, 1815-1913
Yemen for Yemen Arab Republic, 1921-1961
South Yemen for Yemen PDR, 1967-1969
Rhodesia for Zimbabwe, 1965-1979

Area and Population Data

Population Density (pop2) is calculated directly from Area in Square Miles (area1) and Population (pop1), while Population Density of Empire (pop4) is calculated directly from Area of Empire in Square Miles (area3) and Population of Empire (pop3). Area in Square Miles (area1) and Area in Square Kilometers (area2) are converted from one to the other on the basis of the factors 0.3861 (square kilometers to square miles) and 2.590 (square miles to square kilometers). As in a limited number of other original data fields (identified below), where an unusually large number of individual sources were consulted, no bibliographic references are provided for most of the area data. A substantial portion of the latter for the earlier years was, however, derived from the Almanach de Gotha, the Journal of the Royal Statistical Society (London), and The Statesman’s Yearbook.
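The area conversion and density calculation can be sketched as follows. The two factors are those quoted above; the function names are hypothetical, and the sketch ignores any scaling (for example, thousands) that may apply to the stored fields, for which “Codebook.xls” should be consulted.

```python
KM2_TO_MI2 = 0.3861  # square kilometers to square miles (factor from the manual)
MI2_TO_KM2 = 2.590   # square miles to square kilometers (factor from the manual)


def sq_km_to_sq_miles(sq_km):
    """Convert an area in square kilometers to square miles."""
    return sq_km * KM2_TO_MI2


def sq_miles_to_sq_km(sq_miles):
    """Convert an area in square miles to square kilometers."""
    return sq_miles * MI2_TO_KM2


def population_density(population, area_sq_miles):
    """Persons per square mile; None when the area is missing or zero."""
    if not area_sq_miles:
        return None
    return population / area_sq_miles


# Hypothetical country: 1,000 square kilometers, 50,000 inhabitants.
area_mi2 = sq_km_to_sq_miles(1000.0)
print(area_mi2, population_density(50000, area_mi2))
```

Because both factors are rounded, converting in one direction and back again reproduces the starting figure only approximately.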
Area and population of empire data are provided for only 13 countries: Austria-Hungary, Belgium, France, Germany, Italy, Japan, Netherlands, Portugal, Russia, Spain, Turkey (Ottoman Empire), United Kingdom, and United States, thus omitting a few marginal cases, such as the dual monarchies of Denmark-Iceland (to 1944) and Sweden-Norway (to 1905). For the Austro-Hungarian, Ottoman, and Russian Empires, the core territories and imperial domains are contiguous; hence the data in fields area3, pop3, and pop4 duplicate those in fields area1, pop1, and pop2, respectively. The other ten countries are more conventionally identified as “colonial” powers, most of whose possessions are noncontiguous “overseas” territories.

Urbanization Data

All fields give aggregate population figures for cities in the following categories: 100,000 and over, 50,000 and over, 25,000 and over, 20,000 and over, and 10,000 and over. Thus, Population, Cities of 50,000 & Over (urban03) includes cities of 100,000 and over (urban01), and so forth. Per capita data for the same classes of cities are also provided. Most of the externally derived data entries are compiler summations from the sources cited.
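Because the categories are cumulative, the aggregate for a lower size threshold should never be smaller than the aggregate for a higher one. A minimal consistency check along these lines is sketched below; it is keyed by threshold rather than by field name, and the function name nesting_violations is hypothetical.

```python
THRESHOLDS = [100_000, 50_000, 25_000, 20_000, 10_000]


def nesting_violations(totals):
    """Return (higher, lower) threshold pairs whose totals break the nesting.

    totals maps a size threshold to the aggregate population of cities at or
    above that size, or to None when the entry is missing. Only adjacent
    non-missing categories are compared.
    """
    known = [(t, totals[t]) for t in THRESHOLDS if totals.get(t) is not None]
    return [
        (higher, lower)
        for (higher, v_higher), (lower, v_lower) in zip(known, known[1:])
        if v_lower < v_higher
    ]


# Hypothetical country-year: the 25,000-and-over total is too small.
print(nesting_violations({100_000: 500, 50_000: 800, 25_000: 700}))
```

An empty list means the available categories are mutually consistent; it says nothing about the accuracy of the underlying figures.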
The inclusion of data for cities of 20,000 and over as well as for cities of 25,000 and over was originally mandated by a lack of uniformity in reporting categories in the sources utilized. Subsequent to preparation of the original version of the file, however, a series of missing data estimates, proportionally calculated across urbanization categories, was developed. The procedure for calculating these entries (identified by an “F”) is discussed in Banks and Carr, op. cit.
In assembling the urbanization data, considerable difficulty was encountered in regard to the definition of “city” or “urban area”. Insofar as possible, data for core cities or urban areas are employed, excluding greater metropolitan or suburban populations. It cannot be claimed, however, that the reliability problem is completely surmounted. Indeed, in some cases what UN sources term “municipios” (encompassing rural areas surrounding an urban center) are the only aggregations referenced. Such aberrations, when known, are identified by an “H”.
Given the accelerated rate of global urbanization and an increasing dearth of data for smaller-sized localities, most summations for cities of fewer than 100,000 inhabitants have been truncated at 1980. Exceptions are countries with no cities of 100,000 or more; in these cases, lesser categories have been retained.

National Government Revenue and Expenditure Data

National Government Revenue and Expenditure (revexp1) is calculated directly from National Government Revenue (revexp3) and National Government Expenditure (revexp5). National Government Revenue and Expenditure Per Capita (revexp2) is a dependent (calculated) field based on National Government Revenue and Expenditure (revexp1).
National government revenue and expenditure data is reported exclusive of “extraordinary” expenditures financed by direct foreign aid or loans. Revexp4 and revexp6 contain National Government Revenue and National Government Expenditure, respectively, on a per capita basis. Revexp7 contains the ratio of national defense expenditure to total national expenditure. The term “national government” should be construed as referring exclusively to central government. Thus, monies collected and disbursed locally by national government agencies (as in certain unitary systems) are, wherever possible, excluded.
Revenue and expenditure data, especially when expressed, as here, in U.S. dollar equivalents, are particularly susceptible to error and should be used with appropriate caution. The possibility of error could, of course, have been substantially reduced had conversion to a common currency unit not been attempted, but the resultant lack of comparability would severely limit the utility of the data in question.
Prior to 1973, official rates of exchange were employed only when deviations therefrom were presumed to be minimal. Otherwise, free (occasionally black) market rates were employed, except in cases of such extreme fluctuation as to preclude the assembly of meaningful series. Needless to say, the overwhelming proportion of data omitted for this reason occurs in the 1919-1939 period.
Since the British pound sterling was the principal basis of international exchange prior to World War I, most data for the period were assembled accordingly, then converted into dollar equivalents at the rate of 4.87 dollars per pound. Some data for 1919-1939 and most data for the post-World War II period were assembled by means of direct conversion to dollar equivalents. It should be noted that here, as elsewhere, there are no “base-year” figures; in other words, there is no adjustment for inflation/deflation in either the British pound (before 1919) or the U.S. dollar (after 1919).
Since 1973 IMF average period market rates have been utilized wherever feasible.

Trade Data

All trade data is exclusive of transshipments and bullion transfers. Trade1 and trade3 contain import and export data respectively, while trade2 and trade4 contain the same items on a per capita basis. Both imports and exports are f.o.b.
Trade5 is a periodic update of the proportion of world trade (imports and exports) for each country for each year. Since the denominator employed is simply a summation of imports and exports for all independent nations included in the archive, it falls somewhat short of being a total summation of world trade. It may be assumed, however, that the proportion contributed by nonindependent territories for most years is relatively small. As in the case of revenue and expenditure data, conversion to U.S. dollar equivalents involves a certain degree of risk as regards the introduction of error, but without such conversion the data would be largely worthless for comparative purposes.
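The calculation of a country’s proportion of world trade can be sketched as follows. This is an illustration only: the record layout and the function name world_trade_shares are hypothetical, and whether the archive stores the result as a fraction or as a percentage is not stated here.

```python
from collections import defaultdict


def world_trade_shares(records):
    """Compute each country's share of total trade for each year.

    records is a sequence of (country, year, imports, exports) tuples.
    The denominator for a year is the sum of imports and exports over all
    countries present for that year, so it understates true world trade
    to the extent that territories are missing.
    """
    records = list(records)
    totals = defaultdict(float)
    for _country, year, imports, exports in records:
        totals[year] += imports + exports
    shares = {}
    for country, year, imports, exports in records:
        if totals[year] > 0:
            shares[(country, year)] = (imports + exports) / totals[year]
    return shares


# Hypothetical two-country world for the year 1900.
sample = [("A", 1900, 10.0, 10.0), ("B", 1900, 30.0, 30.0)]
print(world_trade_shares(sample))
```

Since the denominator depends on which countries are present in a given year, shares are comparable across years only to the extent that the set of reporting countries is stable.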

Energy Data

This segment provides data on energy production and consumption. Energy1 and energy2 contain data on overall energy production and consumption, respectively, as measured through 1992 in metric tons of coal equivalent and from 1994 in metric tons of oil equivalent. The shift from coal to oil equivalents was necessary because of a corresponding shift by the UN Statistical Office, whose figures are utilized; standardization is achieved by using conversion factors of 0.700 from coal to oil and 1.43 from oil to coal. Energy3 and energy4 contain the same items in kilograms per capita (see listings in “Codebook.xls”).
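Users who need a single unit across the whole series can apply the two conversion factors directly. The sketch below uses the factors quoted in this section; the function names are hypothetical.

```python
COAL_TO_OIL = 0.700  # coal equivalent to oil equivalent (factor from the manual)
OIL_TO_COAL = 1.43   # oil equivalent to coal equivalent (factor from the manual)


def coal_to_oil_equivalent(tons_coal_equivalent):
    """Convert metric tons of coal equivalent to metric tons of oil equivalent."""
    return tons_coal_equivalent * COAL_TO_OIL


def oil_to_coal_equivalent(tons_oil_equivalent):
    """Convert metric tons of oil equivalent to metric tons of coal equivalent."""
    return tons_oil_equivalent * OIL_TO_COAL


# Hypothetical figure: 1,000 metric tons of coal equivalent.
print(coal_to_oil_equivalent(1000.0))
```

The two factors are rounded and are not exact inverses (0.700 multiplied by 1.43 is 1.001), so a round trip shifts a value by about a tenth of one percent.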

Military Data

National Defense Expenditure (military1) is calculated from National Government Expenditure (revexp5) and the ratio National Defense Expenditure/National Government Expenditure (revexp7). While deriving the data in this way unquestionably results in some loss of precision, it was not considered sufficiently consequential to offset the added labor required to assemble collateral data directly from external sources.
Military2 contains military1 data in per capita form.
Military3 is the size of military, while military4 contains the same information on a per capita basis. The “military” is defined as embracing all active-duty members of a nation’s armed forces (army, navy, air corps) and excludes all semi- or paramilitary forces, save in a limited number of cases (such as Japan and Panama) where, for some or all reporting years, military establishments are not formally acknowledged. In the case of Switzerland, which does not maintain a continuously active military establishment, estimates of active-duty reserves are utilized.
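The derivation of military1 and military2 described at the start of this section can be sketched as follows. The sketch assumes that the defense ratio (revexp7) is stored as a fraction between 0 and 1; if it is stored as a percentage, it must first be divided by 100. The function names are hypothetical.

```python
def national_defense_expenditure(total_expenditure, defense_ratio):
    """Defense expenditure as total expenditure times the defense ratio.

    Assumption: defense_ratio is a fraction (0 to 1), not a percentage.
    Returns None when either input is missing.
    """
    if total_expenditure is None or defense_ratio is None:
        return None
    return total_expenditure * defense_ratio


def per_capita(amount, population):
    """Divide an amount by population; None when either is unusable."""
    if amount is None or not population:
        return None
    return amount / population


# Hypothetical country-year: expenditure 2,000, one quarter spent on defense.
defense = national_defense_expenditure(2000.0, 0.25)
print(defense, per_capita(defense, 100.0))
```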

Industrial and Labor Force Data

Industry1 is the Percent GDP Originating in Industrial Activity, while industry2 is the same information on a per capita basis. “Industrial activity” is defined as embracing categories 2-4 of the revised (1958) International Standard Industrial Classification of all Economic Activities (ISIC), which includes mining and quarrying; manufacturing; and electricity, gas and water.

Industry3, industry4 and industry5 contain percent workforce engaged in agriculture, industry, and other activity, respectively. “Industry” is here defined as embracing revised ISIC categories 2-3 and 5, which include mining and quarrying; manufacturing; and construction, while “agriculture” is defined in terms of revised ISIC category 1, which includes agriculture, forestry, and fishing. “Other activity” is simply the sum of the foregoing subtracted from 100%.
It should be noted that some sources report on “civilian labor force employed”, while others report on “number of employees” (based on statistics of establishments). The latter normally encompass only a limited portion of the labor force and, for that reason, have not been utilized.

Railroad Data

Railroad1 embraces railroad mileage, defined as miles of line (both public and private), rather than as miles of track. Thus, ten miles of a single track line would be counted as equal to ten miles of double track line. Tramway (e.g., streetcar) and lift lines are excluded, but not cog railways if of a non-tramline character. Railroad2 contains the same data on a per square mile basis.
Railroad3 and railroad4 deal with rail passenger-miles and rail passenger-kilometers, respectively, the first being a calculated variable derived from the second. These data refer, of course, to the sum of miles or kilometers traveled by each individual rail passenger. Similarly, railroad5 and railroad6 are based on rail ton-miles and rail ton-kilometers, respectively, of freight carried. Railroad7 records rail ton-miles per capita.
Given the recent decline in importance of rail transportation, all of the series in this segment are terminated as of 1981.

Highway Vehicle Data

Vehicle1 and vehicle3 are based on the total number of passenger and commercial vehicles, respectively, while vehicle2 and vehicle4 contain the same two items in per capita form. Vehicle5 (all highway vehicles) is the sum of vehicle1 and vehicle3, while vehicle6 is based on all highway vehicles per capita. Motorcycles and motorized construction equipment are excluded from these categories. Taxis (though technically “commercial vehicles”) are counted as passenger cars. Buses, vans, lorries, etc., are all classified as commercial vehicles, even though some may be privately owned and not used for commercial purposes.

Phone Data

Phone1 is a summation of phone2 and phone3 data, thus referencing all telephones, including cellular. The number of telephones and telephones per capita are located in phone3 and phone4, and, for many years, exhibit a high degree of reliability because of their ultimate source: the reasonably accurate local telephone directory. It should be noted, however, that there is some likelihood of underreporting in the early years of telephonic communication, when a disproportionate number of instruments were owned or operated by private businesses and government offices. An equally serious source of underreporting stems from contemporary reports based on number of lines, which may service a number of instruments.
Phone3 and phone4 exclude mobile cellular telephones. The number of such instruments (since 1989) is given in phone2, while (as noted above) the total number of telephones, both cellular and noncellular, is given in phone1.
Phone5 and phone6 contain dependent (calculated) telephone entries: mobile cellular telephones per capita and all telephones, including cellular, per capita.
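The relationships among the telephone fields can be summarized in a short sketch. It treats a missing cellular entry as zero when forming the total, which is an assumption rather than a documented rule, and the function name derived_phone_fields is hypothetical.

```python
def derived_phone_fields(mobile, fixed, population):
    """Return (phone1, phone5, phone6) from phone2, phone3 and population.

    phone1: all telephones, cellular plus noncellular
    phone5: mobile cellular telephones per capita
    phone6: all telephones per capita
    Assumption: a missing mobile count contributes zero to the total.
    """
    if fixed is None or not population:
        return None, None, None
    total = fixed + (mobile or 0)
    mobile_per_capita = None if mobile is None else mobile / population
    return total, mobile_per_capita, total / population


# Hypothetical country-year: 200 cellular and 800 noncellular telephones.
print(derived_phone_fields(200, 800, 10000))
```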

Telegraph Data

Telegraph1 deals with telegraph mileage, defined as miles of line (both public and private), rather than as miles of wire. Telegraph mileage per square mile is given in telegraph2. Telegraph3 and telegraph4 contain the number of telegrams and telegrams per capita, respectively. For these entries, every effort has been made to report purely domestic telegraphic activity, excluding foreign sent and received, as well as in-transit messages. However, in some cases (particularly in the pre-World War I period) the sources do not adequately distinguish between the several message categories, and occasional over-reporting may be expected. The result is a serious reliability problem for certain Latin American countries during the latter years of the nineteenth century, when an unusually high proportion of telegrams fall into the foreign-sent and foreign-received categories.
Since virtually no data for telegraphic mileage could be located for the post-World War II period, entries for telegraph1 and telegraph2 are discontinued after 1939, while entries for telegraph3 and telegraph4 terminate in 1980.

Mail Data

Mail1 contains first class mail and mail2 first class mail per capita. Mail3 is all letterpost mail, while mail4 contains all letterpost mail per capita.
As in the case of telegraphic communication, the coding criteria call for the exclusion of foreign sent/received and in-transit items, although in some cases where official government figures are used, at least some foreign items appear to be included.
Newspapers carried by mail are included as bona fide (non-first class) postal matter, but since figures for the latter are occasionally lacking, some discrepancies are to be expected in the all-mail category. Post cards are, of course, construed as first class items and prior to World War I constituted a large part of the latter class of mail in many European countries (most notably Germany).
Most of the post-World War II mail figures are from the Universal Postal Union (UPU), which does not distinguish between first-class and all mail. For this reason, the first-class series are, for the most part, terminated as of 1939.

Media Data: Radio and Television Set Data; Newspaper Data; Book Production Data

Media1 and media3 contain data on radio and television sets, respectively, while media2 and media4 deal with the same items on a per capita basis. Media5 is devoted to newspaper circulation per capita, media6 concerns book production by number of titles published, and media7 deals with the latter on a per capita basis. All media data are for comparatively recent years (the earliest, number of radio receivers, goes back only to 1938, while the most recent, television receivers, dates from 1960).
There is a tendency for newspaper circulation to be underreported, since data for weekly and biweekly publications are not included. It should also be noted that book production figures generally include children’s and school text books, and are not restricted to either first edition or hardbound titles. It should be emphasized, however, that the data reference only the number of titles, not copies in print.

School Enrollment Data

School01 and school03 contain data on primary and secondary enrollment, respectively, while school02 and school04 deal with the same items on a per capita basis. School05 aggregates school01 and school03, yielding primary and secondary enrollment, while school06 presents the same data in per capita form. School07 offers primary enrollment as a proportion of primary and secondary enrollment.
Although significant improvement has been registered over the years regarding standardization of reporting categories in educational statistics, many difficulties remain in attempting to assemble truly comparable data, particularly of a longitudinal character. Insofar as possible, data on preprimary, vocational or technical, part-time, and adult education students have been omitted from the archive listings. With these exceptions, every effort has been made to assemble data on the basis of relevant UNESCO criteria:
First level: Education whose main function is to provide basic instruction in the tools of learning (e.g., at elementary school, primary school). Its length may vary from 4 to 9 years, depending on the organization of the school system in each country;
Second level: Education based upon at least four years of previous instruction at the first level, and providing general or specialized instruction, or both (e.g., at middle school, secondary school, high school . . .);
Third level: Education which requires, as a minimum condition of admission, the successful completion of education at the second level, or evidence of the attainment of an equivalent level of knowledge. . . (UN Statistical Yearbook: 1973, p. 781).
Regrettably, the UN criteria for categorizing second-level instruction changed during 1964-65. In general, 1964 “secondary level” figures are equated with 1965 and later “second level: general” figures, but not uniformly so. Also, the omission of vocational education introduces an element of bias, particularly in socialist countries, because of the inclusion of many students under this rubric.
School08 and school10 deal with university and total school enrollment, respectively, while school09 and school11 report the same items on a per capita basis.
School12 contains literacy data, calculated, wherever possible, on the basis of nonliterates, 15 years of age and over. Literacy is defined in the UN Demographic Yearbook (from which most of the post-World War II data are extracted) as “ability both to read and to write”. While this is not an entirely adequate definition, it is unrealistic to assume that the caliber of most reporting agencies could sustain a more precise one.
Indeed, for the limited amount of pre-World War I literacy data that is included in the file, overall reliability must be assessed with extreme caution.

Physician Data

Physician1 deals with inhabitants per physician, while its reciprocal (physicians per capita) appears in physician2. The latter is deemed a somewhat more useful cross-national indicator than the former (which appears in the UN Statistical Yearbook), since the direction of the array, for most countries, accords with that of other “developmental” indicators (tending to yield positive rather than negative correlation coefficients).

National Income and Currency Data

Economics1 is devoted to national income per capita, economics2 to gross domestic product (at factor cost) per capita, and economics3 to gross national product (at market prices) per capita. These three basic components of aggregate product are defined as follows:
Gross national product at market prices is the market value of the product, before deduction of provisions for the consumption of fixed capital, attributable to the factors of production supplied by normal residents of the given country. It is identically equal to the sum of consumption expenditure and gross domestic capital formation, private and public, and the net exports of goods and services plus the net factor incomes received from abroad.
Gross domestic product at factor cost is the value at factor cost of the product, before deduction of provisions for the consumption of fixed capital, attributable to factor services rendered to resident producers of the given country. It differs from the gross domestic product at market prices by the exclusion of the excess of indirect taxes over subsidies.
National income is the sum of the incomes accruing to factors of production supplied by normal residents of the given country before deduction of direct taxes. (UN Yearbook of National Accounts Statistics, 1969, v. 1, p. xi.)
The interrelationships of the three aggregates are as follows: GNP at market prices less net factor income from abroad and indirect taxes net of subsidies equals GDP at factor cost. The latter, in turn, less depreciation, plus net factor income from abroad, equals national income (ibid, p. 819). All data for these three indices for the period 1970-1973 are estimated because of definitional changes in 1970, which make aggregate product figures somewhat inconsistent with earlier figures. It should be noted that as of 1999 the World Bank reports GNP as gross national income (GNI).
Largely because of the abandonment by the UN of national income figures in US dollars, the series terminates as of 1985, although data in national currency continues to be reported by the IMF.
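The interrelationships among the three aggregates described above can be expressed as a short worked example. The function names and the figures are hypothetical; only the arithmetic follows the definitions quoted from the UN source.

```python
def gdp_at_factor_cost(gnp_market_prices, net_factor_income_abroad,
                       indirect_taxes_net_of_subsidies):
    """GNP at market prices, less net factor income from abroad and
    indirect taxes net of subsidies, gives GDP at factor cost."""
    return (gnp_market_prices
            - net_factor_income_abroad
            - indirect_taxes_net_of_subsidies)


def national_income(gdp_factor_cost, depreciation, net_factor_income_abroad):
    """GDP at factor cost, less depreciation, plus net factor income from
    abroad, gives national income."""
    return gdp_factor_cost - depreciation + net_factor_income_abroad


# Hypothetical figures in a common currency unit.
gdp = gdp_at_factor_cost(1000, 20, 80)
income = national_income(gdp, 100, 20)
print(gdp, income)
```

The per capita forms stored in the archive would follow by dividing each aggregate by population.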

Economics4 deals with per capita currency in circulation, expressed in U.S. dollars at the free market rate, save in a limited number of cases where the free rate closely approximates the official rate. Data are from Pick’s Currency Yearbook, whose reports terminated as of 1984.
Economics5 gives the age of a nation’s currency in months. “Age” is defined in terms of the number of months that have elapsed since the introduction of a new monetary system or since an upward or downward revaluation of 5% or more. In cases of multiple revaluations totaling 5% or more during a given year, the count is from the last such revaluation. Because of the general abandonment of artificially pegged and multiple rate systems, the series is discontinued after 1970.
Economics6 gives a nation’s official exchange rate at year’s end, expressed in local currency per U.S. dollar. After 1971 the effective rate (usually the IMF market or principal rate) is used if the official rate is inoperative.
Economics7 gives the free or black market rate in local currency per U.S. dollar, primarily as reported until 1985 by Pick’s Currency Yearbook.

Domestic Conflict Event Data

While no bibliographic references are utilized in connection with these data, most are derived from The New York Times. The eight variable definitions (adopted from Rudolph J. Rummel, “Dimensions of Conflict Behavior Within and Between Nations”, General Systems Yearbook, VIII [1963], 1-50) are as follows:
Assassinations (domestic1). Any politically motivated murder or attempted murder of a high government official or politician.
General Strikes (domestic2). Any strike of 1,000 or more industrial or service workers that involves more than one employer and that is aimed at national government policies or authority.
Guerrilla Warfare (domestic3). Any armed activity, sabotage, or bombings carried on by independent bands of citizens or irregular forces and aimed at the overthrow of the present regime.
Major Government Crises (domestic4). Any rapidly developing situation that threatens to bring the downfall of the present regime – excluding situations of revolt aimed at such overthrow.
Purges (domestic5). Any systematic elimination by jailing or execution of political opposition within the ranks of the regime or the opposition.
Riots (domestic6). Any violent demonstration or clash of more than 100 citizens involving the use of physical force.
Revolutions (domestic7). Any illegal or forced change in the top government elite, any attempt at such a change, or any successful or unsuccessful armed rebellion whose aim is independence from the central government.
Anti-government Demonstrations (domestic8). Any peaceful public gathering of at least 100 people for the primary purpose of displaying or voicing their opposition to government policies or authority, excluding demonstrations of a distinctly anti-foreign nature.
It should be noted that because these data are based on newspaper reports, they are somewhat biased geographically and limited in comprehensiveness. Other distortions are attributable to venues not deemed clearly domestic, such as, for example, the Israel-Palestinian conflict. For these and other reasons, the contents of this segment should be used with extreme caution and, in general, only for macroanalytic purposes.
Domestic9 is used for weighted conflict measures, the specific weights being variable. As of October 2007 the values entered were: Assassinations (25), Strikes (20), Guerrilla Warfare (100), Government Crises (20), Purges (20), Riots (25), Revolutions (150), and Anti-Government Demonstrations (10).
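As an illustration only, the weights listed above can be applied to a set of yearly event counts as in the following Python sketch. It assumes a plain weighted sum; any additional scaling applied to domestic9 in the archive is not described here and is not reproduced. The function and dictionary names are hypothetical.

```python
# Weights as listed for October 2007, keyed by variable name.
WEIGHTS = {
    "domestic1": 25,   # Assassinations
    "domestic2": 20,   # General Strikes
    "domestic3": 100,  # Guerrilla Warfare
    "domestic4": 20,   # Major Government Crises
    "domestic5": 20,   # Purges
    "domestic6": 25,   # Riots
    "domestic7": 150,  # Revolutions
    "domestic8": 10,   # Anti-government Demonstrations
}


def weighted_conflict_sum(counts):
    """Return the weighted sum of event counts for one country-year.

    Assumption: a plain weighted sum; variables missing from `counts`
    are treated as zero events.
    """
    return sum(weight * counts.get(name, 0) for name, weight in WEIGHTS.items())


# One assassination, two riots, three anti-government demonstrations.
example = {"domestic1": 1, "domestic6": 2, "domestic8": 3}
print(weighted_conflict_sum(example))  # 25*1 + 25*2 + 10*3 = 105
```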

Electoral Data

The percent turnout in the most recent (lower house) legislative election is given in electoral1; electoral2 gives the number of registered voters (in some cases, such as the United States, those eligible to register and vote) for the year in question; electoral4 contains the number of valid votes cast. (For the overall turnout, including those whose ballots were disallowed, electoral2 should be multiplied by electoral1). In situations involving runoff balloting, the figures are based on first-round results.
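The turnout arithmetic described above can be sketched in Python as follows. The sketch assumes that electoral1 is stored as a percentage (0-100), so the product is divided by 100; if the file stores a proportion instead, the division should be dropped. The function names are hypothetical.

```python
def overall_turnout(electoral1_percent, electoral2_registered):
    """Estimate total ballots cast, including disallowed ballots.

    Assumption: electoral1 is a percentage between 0 and 100.
    """
    return electoral2_registered * electoral1_percent / 100.0


def disallowed_ballots(electoral1_percent, electoral2_registered, electoral4_valid):
    """Estimate disallowed ballots as overall turnout minus valid votes."""
    return overall_turnout(electoral1_percent, electoral2_registered) - electoral4_valid


# 80% turnout among 1,000,000 registered voters, 780,000 valid votes.
print(overall_turnout(80.0, 1_000_000))
print(disallowed_ballots(80.0, 1_000_000, 780_000))
```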

Legislative Process Data

Legis01 contains the number of seats held by the largest party in the lower house of each country’s national assembly. Legis02 contains the total number of seats in the lower house, except in cases where no parties exist (or did not exist at the last election), where a zero is entered (in such cases, the absence of a legislature is indicated by zero entries in legis03 and legis04). In one-party systems with legislative membership in excess of 999, the latter figure is employed in legis01 and legis02.
Legis03-legis06 contain ordinal-scaled data, coded as follows:
Legis03. Effectiveness of Legislature
(3) Effective
(2) Partly Effective
(1) Largely Ineffective
(0) No Legislature
Legis04. Nominating Process
(3) Competitive
(2) Partly Competitive
(1) Essentially Non-Competitive
(0) No Legislature
Legis05. Legislative Coalitions
(3) More than one party, no coalitions
(2) More than one party, government coalition, opposition
(1) More than one party, government coalition, no opposition
(0) One party or no parties
Legis06. Party Legitimacy
(3) No parties excluded
(2) One or more minor or “extremist” parties excluded
(1) Significant exclusion of parties (or groups)
(0) No parties, or all but dominant parties and satellites excluded

It may be noted that the data in legis03 are substantively similar to the data in polit13, below. The two data sets are not, however, identical. For the earlier years they were coded at different times and incorporated into the file as components of different subfiles. In recent years, they tend to converge.
Legis07 is an index of seats held by the largest party, obtained by dividing legis02 by legis01. The principal reason for calculating the index in this manner (rather than as a percentage of seats held) is to ensure that the entries for countries with no parties (or no legislatures) and countries with one-party systems will be adjacent, rather than at opposite extremes of the array. Thus, a country with no parties has a score of 0, a one-party system has a score of 1.0, a system in which the largest party holds 40 out of 100 seats has a score of 2.5, etc.
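The calculation of the index can be sketched in Python as follows. The function name is hypothetical, and the handling of zero entries simply follows the convention stated above (a score of 0 where no parties or no legislature exist).

```python
def legis07_index(legis01, legis02):
    """Return total lower-house seats divided by seats of the largest party.

    A zero in either field (no parties or no legislature) yields a score of 0,
    which also avoids division by zero.
    """
    if legis01 == 0 or legis02 == 0:
        return 0.0
    return legis02 / legis01


print(legis07_index(0, 0))      # no parties: 0.0
print(legis07_index(100, 100))  # one-party system: 1.0
print(legis07_index(40, 100))   # largest party holds 40 of 100 seats: 2.5
```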
Legis08-legis10 contain secondary data derived from items appearing above. Legis08 is a total of the ordinal scores contained in legis03-legis06 and, as such, may be construed as a simple, nonfactorial, measure of political polyarchy or pluralism. Legis09 contains seven-year averages of the data in legis07, while legis10 contains seven-year totals of the data in legis08.
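A minimal Python sketch of these derived items follows. The manual does not state how the seven-year windows are aligned, so the sketch assumes trailing windows (each window ends at the year being scored); that choice, like the function names, is an assumption made for illustration.

```python
def legis08_score(legis03, legis04, legis05, legis06):
    """Sum the four ordinal scores (each 0-3), giving a total between 0 and 12."""
    return legis03 + legis04 + legis05 + legis06


def trailing_windows(values, size=7):
    """Return every run of `size` consecutive values, oldest window first.

    Assumption: windows are trailing, i.e. each ends at the year being scored.
    """
    return [values[i - size + 1 : i + 1] for i in range(size - 1, len(values))]


def seven_year_averages(legis07_series):
    """Averages of legis07 over each seven-year window (cf. legis09)."""
    return [sum(window) / len(window) for window in trailing_windows(legis07_series)]


def seven_year_totals(legis08_series):
    """Totals of legis08 over each seven-year window (cf. legis10)."""
    return [sum(window) for window in trailing_windows(legis08_series)]


print(legis08_score(3, 3, 2, 3))
print(seven_year_totals([1, 1, 1, 1, 1, 1, 1, 2]))
```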

Political Data

Polit01 is a party fractionalization index, based on a formula proposed by Douglas Rae in “A Note on the Fractionalization of Some European Party Systems”, Comparative Political Studies, 1 (October 1968), 413-418. The index is constructed as follows:
\[ F = 1 - \sum_{i=1}^{m} t_i^{2} \]
where t_i = the proportion of members associated with the ith party in the lower house of the legislature and m = the number of parties (where there are no parties, a zero is entered)
In calculating the Index entries, independents are disregarded and legislative changes between elections are not taken into account. It should also be noted that sources vary on the distribution of seats (and even the overall number of seats) for many countries; thus figures calculated by different researchers may vary.
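The index can be computed from a list of seat counts as in the following Python sketch. The function name is hypothetical; the caller is assumed to have already removed independents from the list, in keeping with the rule stated above.

```python
def rae_fractionalization(seats_by_party):
    """Return F = 1 - sum(t_i ** 2), where t_i is each party's share of party-held seats.

    Independents are assumed to be excluded by the caller.
    Where there are no parties, zero is returned.
    """
    seats = [s for s in seats_by_party if s > 0]
    total = sum(seats)
    if total == 0:
        return 0.0
    return 1.0 - sum((s / total) ** 2 for s in seats)


print(rae_fractionalization([]))            # no parties
print(rae_fractionalization([100]))         # a single party
print(rae_fractionalization([50, 50]))      # two equal parties
print(rae_fractionalization([50, 25, 25]))  # three parties
```

A single party gives F = 0, and F approaches 1 as seats are spread across more parties of similar size.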

Polit02-polit15 embrace 14 nominal and ordinal political variables coded as follows:
Polit02. Type of Regime
(1) Civilian. Any government controlled by a nonmilitary component of the nation’s population.
(2) Military-Civilian. Outwardly civilian government controlled by a military elite. Civilians hold only those posts (up to and including that of Chief of State) for which their services are deemed necessary for successful conduct of government operations. An example would be retention of the Emperor and selected civilian cabinet members during the period of Japanese military hegemony between 1932 and 1945.
(3) Military. Direct rule by the military, usually (but not necessarily) following a military coup d’état. The governing structure may vary from utilization of the military chain of command under conditions of martial law to the institution of an ad hoc administrative hierarchy with at least an upper echelon staffed by military personnel.
(4) Other. All regimes not falling into one or another of the foregoing categories, including instances in which a country, save for reasons of exogenous influence, lacks an effective national government. An example of the latter would be Switzerland between 1815 and 1848.
Polit03. Coups d’État
The number of extraconstitutional or forced changes in the top government elite and/or its effective control of the nation’s power structure in a given year. The term “coup” includes, but is not exhausted by, the term “successful revolution”. Unsuccessful coups are not counted.
Polit04. Major Constitutional Changes
The number of basic alterations in a state’s constitutional structure, the extreme case being the adoption of a new constitution that significantly alters the prerogatives of the various branches of government. Examples of the latter might be the substitution of presidential for parliamentary government or the replacement of monarchical by republican rule. Constitutional amendments which do not have significant impact on the political system are not counted.
Polit05. Head of State
(1) Monarch. Chief of state is a monarch (either hereditary or elective) or a regent functioning on a monarch’s behalf.
(2) President. Chief of state is a president who may function as a chief executive or merely as titular head of state, in which case he will possess little effective power. The presiding officer of a legislative assembly or state council may qualify for the coding, even though the formal title may be that of “chairman”.
(3) Military. A situation in which a member of the nation’s armed forces is recognized as the formal head of government. In case of conflict between (2) and (3), coding is determined on the basis of whether the incumbent’s role is intrinsically military or civilian in character.
(4) Other. This category is generally used when no distinct head of state can be identified; it also includes individuals not included in (1-3), such as theocratic rulers, as well as nonmilitary individuals serving in a collegial capacity.
Polit06. Premier
(1) Formal executive is premierial, including “Chairman, Council of Ministers”
(2) Formal executive is non-premierial
Polit07. Effective Executive (Type)
Refers to the individual who exercises primary influence in the shaping of most major decisions affecting the nation’s internal and external affairs. The “other” category may refer to a situation in which the individual in question (such as the party first secretary in a Communist regime) holds no formal governmental post, or to one in which no truly effective national executive can be said to exist.
(1) Monarch
(2) President
(3) Premier
(4) Military
(5) Other
Polit08. Effective Executive (Selection)
(1) Direct Election. Election of the effective executive by popular vote or the election of committed delegates for the purpose of executive selection.
(2) Indirect Election. Selection by an elected assembly or by an elected but uncommitted electoral college. This coding is also used when a legislature is called upon to make the selection in a plurality situation.
(3) Nonselective. Any means of selection not involving a direct or indirect mandate from an electorate.
Polit09. Parliamentary Responsibility
Refers to the degree to which a premier must depend on the support of a majority in the lower house of a legislature to remain in office.
(0) Irrelevant. Office of premier or legislature does not exist.
(1) Absent. Office of premier exists, but there is no parliamentary responsibility.
(2) Incomplete. The premier is, at least to some extent, constitutionally responsible to the legislature. Effective responsibility is, however, limited.
(3) Complete. The premier is constitutionally and effectively dependent on a legislative majority for continuance in office.
Polit10. Size of Cabinet (end of year)
Refers to the number of ministers of “cabinet rank”, excluding undersecretaries, parliamentary secretaries, ministerial alternates, etc. Includes the president and vice-president under a presidential system, but not under a parliamentary system.
In many cases, counts are approximate, since sources often differ (particularly in regard to “ministers of state”) as to what constitutes cabinet status.
Generally, the count is of ministries, not of individuals holding multiple offices (the most extreme recent case being that of New Zealand).
Polit11. Major Cabinet Changes
The number of times in a year that a new premier is named and/or 50% of the cabinet posts are assumed by new ministers.
Polit12. Changes in Effective Executive
The number of times in a year that effective control of executive power changes hands. Such a change requires that the new executive be independent of his predecessor.
Polit13. Legislative Effectiveness
(0) None. No legislature exists.
(1) Ineffective. There are three possible bases for this coding: first, legislative activity may be essentially of a “rubber stamp” character; second, domestic turmoil may make implementation of legislation impossible; third, the effective executive may prevent the legislature from meeting, or otherwise substantially impede the exercise of its functions.
(2) Partially Effective. A situation in which the effective executive’s power substantially outweighs, but does not completely dominate, that of the legislature.
(3) Effective. The possession of significant governmental autonomy by the legislature, typically including substantial authority in regard to taxation and disbursement, and the power to override executive vetoes of legislation.
Polit14. Legislative Selection
(0) None. No legislature exists.
(1) Nonelective. Examples would be the selection of a majority of legislators by the effective executive, or by means of heredity or ascription.
(2) Elective. A majority of legislators (or members of the lower house in a bicameral system) are selected by means of either direct or indirect popular election.
Polit15. Legislative Election
The number of elections held for the lower house of a national legislature in a given year. A limited number of by-elections are included, but most are not.

International Status Indicators

Instat1-instat8 embrace, for the period 1817-1935, eight international status indicators developed by J. David Singer and Melvin Small in “The Composition and Status Ordering of the International System: 1815-1940,” World Politics, 18 (January 1966), 236-282. Singer and Small provide entries, in each case, for every fifth year. Yearly estimates were calculated and are provided in the present file for the basic variable, “International Status, Composite Score”, which appears in instat3. The full set of entries is as follows:
Instat1. International Status: Ranking
Instat2. International Status: Case Size
Instat3. International Status: Composite Score
Instat4. International Status: Composite Standardized Score
Instat5. International Status: Quintile
Instat6. International Status: Weighted Rank
Instat7. International Status: Weighted Status Ordering
Instat8. International Status: Weighted Quintile
For a discussion of these data and the coding criteria employed, see Singer and Small, op. cit.
Given structural changes in the international system, the Singer-Small coding criteria became increasingly irrelevant in the late 1930’s and no attempt was made to continue the series beyond 1935.

Computer Indices

Beginning in 1999, the following set of computer indices is reported:
Computer1. Internet Hosts
Computer2. Internet Hosts Per Capita
Computer3. Internet Users
Computer4. Internet Users Per Capita
Computer5. Estimated Personal Computers
Computer6. Estimated Personal Computers Per Capita

Industrial Production

Indprod1 gives electric power production. Insofar as possible, the data include production for both public and private purposes, and cover both thermal and hydroelectric output, thus reflecting total gross generation of electricity, excluding station use and transmission losses. Indprod2 gives the same information in per capita form.
As in the case of data in Energy Production (see above), conversion factors are linked to time spans adopted by the UN Statistical Office. Here there are three such spans: kilowatt hours for 1919-1980, metric tons of coal equivalent for 1981-1993, and metric tons of oil equivalent from 1994. The conversion factors used are .123 from 1000000 kwh to 1000 coal; .086 from 1000000 kwh to 1000 oil; 8.13 from 1000 coal to 1000000 kwh; .700 from 1000 coal to 1000 oil; 1.43 from 1000 oil to 1000 coal; and 11.6 from 1000 oil to 1000000 kwh (see listings in “Codebook.xls”).
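The conversion factors listed above can be arranged as a lookup table, as in the following Python sketch. It assumes that “1000000 kwh” denotes units of one million kilowatt hours and that “1000 coal” and “1000 oil” denote units of one thousand metric tons of coal or oil equivalent; the unit labels and function name are hypothetical.

```python
# Conversion factors as given in the manual, keyed by (source unit, target unit).
# Assumed units: "kwh"  = millions of kilowatt hours,
#                "coal" = thousands of metric tons of coal equivalent,
#                "oil"  = thousands of metric tons of oil equivalent.
FACTORS = {
    ("kwh", "coal"): 0.123,
    ("kwh", "oil"): 0.086,
    ("coal", "kwh"): 8.13,
    ("coal", "oil"): 0.700,
    ("oil", "coal"): 1.43,
    ("oil", "kwh"): 11.6,
}


def convert(value, source, target):
    """Convert a production figure from one reporting unit to another."""
    if source == target:
        return value
    return value * FACTORS[(source, target)]


# 1,000 million kWh expressed in thousands of metric tons of coal equivalent.
print(convert(1000, "kwh", "coal"))
```

Because the published factors are rounded, converting a figure to another unit and back again will not return exactly the original value.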
Indprod3 contains data on crude steel production, including, insofar as possible, both ingots and steel for castings, whether obtained from pig iron or scrap. Wrought (puddled) iron is generally excluded. Indprod4 gives the same data in per capita form.
Indprod5 contains data on the total production of hydraulic cements used for construction purposes (Portland, metallurgic, aluminous, natural, etc.). Indprod6 gives the same data in per capita form.

Percent Annual Increase Data

All of the fields in these segments contain derived data of a percent annual increase character. Values are calculated only on the basis of entries >0 for consecutive years. A value entered at year Y is (B-A)/A, where A is the original entry for year Y-1 and B is the original entry for year Y. For the items included in this subset, see “Codebook.xls”.
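The rule can be sketched in Python as follows, for a series held as a mapping from year to value. The function name and data layout are hypothetical. The result is the ratio (B-A)/A as stated, where A is the entry for the preceding year; whether the archive additionally multiplies by 100 is not stated and is not assumed here.

```python
def annual_increase(series):
    """Return {year: (B - A) / A} for every year whose own entry (B) and
    preceding-year entry (A) are both present and greater than zero.

    `series` maps year -> value; missing years and None values are skipped.
    """
    result = {}
    for year, b in series.items():
        a = series.get(year - 1)
        if a is not None and b is not None and a > 0 and b > 0:
            result[year] = (b - a) / a
    return result


# 1952 is missing, so no value is produced for 1953;
# the zero entry in 1954 is also excluded.
data = {1950: 100, 1951: 110, 1953: 120, 1954: 0}
print(annual_increase(data))
```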

SOURCES AND SOURCE IDENTIFICATION

Cited Sources

About a dozen sources (primarily serial publications) are so extensively utilized that each is referenced by a unique alphabetic tag symbol. All such references are included in the Excel format file, “Bibliography.xls”. For example, a “D” in a tag column means that the datum in question is taken from the UN Demographic Yearbook; an “S” in a tag column means that the datum is from The Statesman’s Yearbook; etc. For this type of citation, specific page numbers and volume/years are not provided because of the magnitude of the listings that would be required. In the case of serials, however, every effort has been made to utilize the most recent editions, in order to benefit from revisions of earlier information.
In addition to the identification of frequently used general sources, a wide variety of specific citations are given in the file. These sources are listed by Country ID (code), Segment, Field (the original snnfn variable name), and temporal Range (ranges), and are referenced by the following tag symbols:
B – Nonofficial source
G – Official national government source
I – Country informant
X – Official national government figure converted to $US
Z – Nonofficial source converted to $US
By way of example, the following entry appears in the file which is sorted by Country ID, Segment, Field and Range:

code  tag  snnfn  variable    ranges     Source
code  B    S18F4  electoral2  1919-1971  Stein Rokkan & Jean Meyriat, Eds., International Guide to Electoral Statistics. The Hague: Mouton, 1969, p. 45.

The entry tells us that the data for country 0061 (Austria), Segment 18, Fields 4 and 6 (variable electoral2), from 1919 through 1971 are taken from a nonofficial source: page 45 of the volume edited by Rokkan and Meyriat.
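For readers processing the citation listing programmatically, a field identifier such as S18F4 can be split into its segment and field numbers as in the following Python sketch. The pattern (the letter S, a segment number, the letter F, a field number) is inferred from the single example above and is an assumption; the function name is hypothetical.

```python
import re

# Assumed pattern, inferred from the example "S18F4": S<segment>F<field>.
SNNFN_PATTERN = re.compile(r"^S(\d+)F(\d+)$")


def parse_snnfn(name):
    """Return (segment, field) as integers, or None if the name does not match."""
    match = SNNFN_PATTERN.match(name.strip().upper())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_snnfn("S18F4"))       # (18, 4)
print(parse_snnfn("electoral2"))  # None
```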

Uncited Sources

Two sets of unflagged data are derived from quite limited sources: the event-type Domestic Conflict Event Data are primarily derived from The New York Times, while the International Status Indicators are (except for estimated data in International Status, Composite Score) taken from Singer and Small, op. cit. (The Composite Score uses “B” tags to distinguish original from estimated entries.)
Finally, most of the Area Data and all of the Legislative Process Data are presented without source tag symbols. For most of the political data specific citations were virtually impossible because of the large number of sources consulted; furthermore, since none contained estimated data, there was no need to distinguish between original and estimated entries.
