May 23, 2018

Methodology

Data Sets

State of Existing Data on Ceasefire Violations

There is no systematic official data on ceasefire violations (CFVs) in the public domain in India or Pakistan, let alone data specifying the causes of CFVs and the locations where they take place. CFVs are reported by the respective security forces through established channels to the relevant government departments in India and Pakistan. The two governments often release aggregate yearly CFV figures without indicating the specific locations or when and how the violations occurred. While UNMOGIP reports CFVs to UN Headquarters, this information is not shared publicly. In any case, the Indian side has not reported CFVs to UNMOGIP since 1972, after the Simla Agreement. The UNMOGIP data would therefore include only ‘alleged’ CFVs as reported by Pakistan.

Note on the existing sources of data

All major governmental agencies in India and Pakistan, including the Indian MoD, MHA, BSF and Army HQs and the Pakistani Inter-Services Public Relations (ISPR), have regularly updated Twitter handles that can be useful sources of information on CFVs. Individual listings and reports on the government websites are mostly sporadic (site search engines are unable to generate relevant data in an organised manner), unavailable (links are outdated and return HTTP 404 Not Found errors) or missing (a press release is not issued for each violation).

How we overcame the lack of data

To overcome this lack of meaningful data, we generated two new datasets (separately, for India and for Pakistan) by listing all CFVs reported in open sources on a daily basis in India and in Pakistan, from 2002 to 2016.
Open sources primarily include print and digital media reports (newspapers, mostly English-language national dailies from India and Pakistan, and occasionally press reports from outside the two countries). Media reports usually include the date, the location and, in some cases, the causes of CFVs and the casualties as indicated by the security forces.

However, not all CFVs are reported in the press. Because press reporting is bound to undercount the actual occurrences of violations, the number of CFVs listed in the project’s datasets is significantly lower than the number of CFVs accounted for in the official records.

What distinguishes our datasets is their ability to provide a chronological record of CFVs from 2002 to 2016, which is very helpful in understanding the patterns, locations and major causes of CFVs between India and Pakistan. This has far more analytic utility than the cumulative annual data provided by the security forces.

Each dataset consists of six columns: the date of the CFV (or the date on which it was reported), its location, the location type (LOC or IB), the reason given for the CFV or the trigger event, additional information (including casualties) and the source of the information. CFVs were first tabulated chronologically, year by year, from 2002 to 2016. Based on the listings for all the years, subsidiary datasets tabulating the CFVs by reason, by location and by area were created. The Indian and the Pakistani datasets were consolidated into an additional dataset to compare the locations, areas and reasons for CFVs reported by each side. The individual and comparative subsidiary datasets were then represented graphically as bar charts and pie charts. The areas of CFVs on the Indian and the Pakistani sides were plotted on Google Maps.
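
The six-column layout and the subsidiary tabulations described above can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the field names, the tabulate helper and the sample rows are all hypothetical, and the rows are invented placeholders rather than real incidents.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CFVRecord:
    """One dataset row. Field names are illustrative assumptions."""
    reported_on: date  # date of the CFV, or the date it was reported
    location: str      # place named in the report
    line: str          # location type: "LOC" or "IB"
    reason: str        # reason given for the CFV / trigger event
    notes: str         # additional information, including casualties
    source: str        # where the report was found


def tabulate(records, key):
    """Count records per value returned by `key` for each record."""
    return Counter(key(r) for r in records)


# Placeholder rows, invented only to exercise the code; not real data.
rows = [
    CFVRecord(date(2002, 1, 5), "Sector A", "LOC", "reason X", "", "Daily 1"),
    CFVRecord(date(2002, 3, 9), "Sector B", "IB", "reason Y", "", "Daily 2"),
    CFVRecord(date(2003, 7, 1), "Sector A", "LOC", "reason X", "", "Daily 1"),
]

# Year-wise listing first, then the subsidiary tabulations.
by_year = tabulate(rows, lambda r: r.reported_on.year)
by_reason = tabulate(rows, lambda r: r.reason)
by_location = tabulate(rows, lambda r: r.location)
by_line = tabulate(rows, lambda r: r.line)
```

Keeping one record per reported CFV, rather than only yearly totals, is what allows the same rows to be regrouped by year, reason or location, and to be compared across the Indian and Pakistani datasets.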