Researchers often wish to carry out additional calculations or analyses using the survival data from one or more studies of other authors. When it is not possible to obtain the raw data directly, reconstruction techniques provide a valuable alternative. Several authors have proposed methods/tools for extracting data from such curves using digitizing software. Instead of using a digitizer to read in the coordinates from a raster image, we propose directly reading in the lines of the PostScript file of a vector image.
Using examples, and a formal error analysis, we illustrate the extent to which, with what accuracy and precision, and in what circumstances, this information can be recovered from the various electronic formats in which such curves are published. We focus on the additional precision, and elimination of observer variation, achieved by using vector-based formats rendered by PostScript, rather than the lower resolution image-based formats that have been analyzed up to now. We provide some R code to process these.
If the raster-based images are available, one can reliably recover much of the original information that seems to be "hidden" beneath published survival curves. If the original images can be obtained as a PostScript file, the data recovered from it can then be either input into these tools or processed directly. We found that the PostScript used by Stata discloses considerably more of the data hidden behind survival curves than that generated by other statistical packages.
When it is not possible to obtain the raw data from the authors, reconstruction techniques are a valuable alternative. Compared with previous approaches, one advantage of ours is that there is no observer variation: there is no need to repeat the digitization process, since the extraction is completely replicable.
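To make the idea of reading coordinates straight from the PostScript text more concrete, here is a minimal Python sketch. It is not the authors' R code. It assumes a hypothetical file fragment in which the curve is drawn with the standard `moveto` and `lineto` operators, one per line, and it assumes the PostScript positions of two axis values are already known so that coordinates can be rescaled linearly. Real files often define package-specific abbreviations for these operators, so the pattern would need adapting. The names `SAMPLE_PS`, `read_path_points`, `rescale`, and `find_drops` are illustrative only.

```python
import re

# Hypothetical PostScript fragment (an assumption, not taken from any real file):
# a step-shaped curve drawn with the standard moveto/lineto operators.
SAMPLE_PS = """\
newpath
100.00 500.00 moveto
150.25 500.00 lineto
150.25 460.00 lineto
230.50 460.00 lineto
230.50 380.00 lineto
stroke
"""

# Matches lines of the form "<x> <y> moveto" or "<x> <y> lineto".
PATH_OP = re.compile(
    r"^\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s+(?:moveto|lineto)\s*$"
)


def read_path_points(ps_text):
    """Collect the (x, y) pairs of every moveto/lineto line, in file order."""
    points = []
    for line in ps_text.splitlines():
        match = PATH_OP.match(line)
        if match:
            points.append((float(match.group(1)), float(match.group(2))))
    return points


def rescale(value, ps_lo, ps_hi, data_lo, data_hi):
    """Map a PostScript coordinate to the data scale by linear interpolation."""
    return data_lo + (value - ps_lo) * (data_hi - data_lo) / (ps_hi - ps_lo)


def find_drops(points):
    """Return (x, y_before, y_after) for each vertical downward segment."""
    drops = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 == x1 and y1 < y0:
            drops.append((x0, y0, y1))
    return drops


if __name__ == "__main__":
    pts = read_path_points(SAMPLE_PS)
    # Assumed axis calibration: x = 100..500 spans time 0..40,
    # y = 300..500 spans survival 0..1.
    for x, y_before, y_after in find_drops(pts):
        t = rescale(x, 100.0, 500.0, 0.0, 40.0)
        s_before = rescale(y_before, 300.0, 500.0, 0.0, 1.0)
        s_after = rescale(y_after, 300.0, 500.0, 0.0, 1.0)
        print(f"drop at t={t:.3f}: S {s_before:.3f} -> {s_after:.3f}")
```

Because the coordinates are read as text rather than judged by eye from pixels, running the script again on the same file gives the same numbers, which is the replicability the abstract refers to.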