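I'm looking for a complete list of ICD-9 codes (medical codes) for diseases and procedures, in a format that can be imported into a database and referenced programmatically. My question is basically the same as "Looking for resources for ICD-9 codes", but the original poster never said where he actually "got" his complete list.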

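Google is definitely not my friend here. I have spent many hours searching on this and found plenty of rich-text lists (such as the CDC's) and websites where I can drill down interactively to the full list, but I cannot find where to get the list that populates those sites in a form that can be parsed into a database. I believe the files at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD9-CM/2009/ contain what I want, but they are in rich text format and carry a lot of junk and formatting that would be hard to strip out accurately.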

I know someone must have done this already, and I'm trying to avoid duplicating their effort, but I just can't find an XML/CSV/Excel list.

Accepted answer

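After removing the RTF, it wasn't too hard to parse the files and convert them to CSV. My resulting parsed files, containing all of the 2009 ICD-9 codes for diseases and procedures, are here: http://www.jacotay.com/files/Disease_and_ProcedureCodes_Parsed.zip The parser I wrote is here: http://www.jacotay.com/files/RTFApp.zip It is basically a two-step process: take the files from the CDC FTP site and remove the RTF from them, then select the RTF-free files and parse them into CSV files. The code here is pretty rough because I only needed to get the results out once.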
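Since the goal in the question is to get the list into a database, here is a rough Python sketch of loading one of the parsed files into SQLite. It assumes each line has the form `code,"description"` with commas and double quotes inside the description escaped as `&#44;` and `&quot;`, which is what the parser code in this answer writes. The table name `icd9` and the function name `load_icd9` are made up for the example, and the second sample row is invented purely to show the unescaping.

```python
import csv
import io
import sqlite3


def load_icd9(conn, lines):
    """Insert code,"description" rows into a hypothetical icd9 table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS icd9 (code TEXT PRIMARY KEY, description TEXT)"
    )
    rows = []
    for record in csv.reader(lines):
        if len(record) != 2:
            continue  # skip blank or malformed lines
        code, description = record
        # Undo the &quot; / &#44; escaping applied when the CSV was written.
        description = description.replace("&quot;", '"').replace("&#44;", ",")
        rows.append((code, description))
    conn.executemany("INSERT OR REPLACE INTO icd9 VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # In practice you would pass an open file object instead of this sample.
    sample = io.StringIO(
        '001.0,"Cholera due to Vibrio cholerae"\n'
        'X00.0,"Invented row&#44; with &quot;escaped&quot; text"\n'
    )
    conn = sqlite3.connect(":memory:")
    load_icd9(conn, sample)
    for row in conn.execute("SELECT code, description FROM icd9 ORDER BY code"):
        print(row)
```

`INSERT OR REPLACE` is used so that re-running the import does not fail on codes that are already present; if you would rather be told about duplicates, a plain `INSERT` would raise instead.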

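Here is the code for the parsing app in case the external links go down (it is the code-behind for a form that lets you select a file name and click a button)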

Public Class Form1

Private Sub btnBrowse_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnBrowse.Click
    Dim p As New OpenFileDialog With {.CheckFileExists = True, .Multiselect = False}
    Dim pResult = p.ShowDialog()
    If pResult = Windows.Forms.DialogResult.Cancel OrElse pResult = Windows.Forms.DialogResult.Abort Then
        Exit Sub
    End If
    txtFileName.Text = p.FileName
End Sub

Private Sub btnGo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    FileText = RemoveRTF(FileText)
    IO.File.WriteAllText(Replace(pFile.FullName, pFile.Extension, "_fixed" & pFile.Extension), FileText)

End Sub


Function RemoveRTF(ByVal rtfText As String) As String
    '// Let a RichTextBox control parse the RTF markup; its Text
    '// property then exposes the same content as plain text.
    Using rtBox As New System.Windows.Forms.RichTextBox
        rtBox.Rtf = rtfText
        Return rtBox.Text
    End Using
End Function


Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    Dim DestFileLine As String = ""
    Dim DestFileText As New System.Text.StringBuilder

    'Need to parse at lines with numbers, lines with all caps are thrown away until next number
    FileText = Strings.Replace(FileText, vbCr, "")
    Dim pFileLines = FileText.Split(vbLf)
    Dim CurCode As String = ""
    For Each pLine In pFileLines
        'Normalise tabs and trim first, so whitespace-only lines are skipped too
        pLine = pLine.Replace(ChrW(9), " ").Trim
        If pLine.Length = 0 Then
            Continue For
        End If

        Dim NonCodeLine As Boolean = False
        If IsNumeric(pLine.Substring(0, 1)) OrElse (pLine.Length > 3 AndAlso (pLine.Substring(0, 1) = "E" OrElse pLine.Substring(0, 1) = "V") AndAlso IsNumeric(pLine.Substring(1, 1))) Then
            Dim SpacePos As Int32
            SpacePos = InStr(pLine, " ")
            Dim NewCode As String
            NewCode = ""
            If SpacePos >= 3 Then
                NewCode = Strings.Left(pLine, SpacePos - 1)
            End If

            If SpacePos < 3 OrElse Strings.Mid(pLine, SpacePos - 1, 1) = "." OrElse InStr(NewCode, "-") > 0 Then
                NonCodeLine = True
            Else
                If CurCode <> "" Then
                    DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                    DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                    DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                    CurCode = ""
                    DestFileLine = ""
                End If

                CurCode = NewCode
                DestFileLine = Strings.Mid(pLine, SpacePos + 1)
            End If
        Else
            NonCodeLine = True
        End If


        If NonCodeLine = True AndAlso CurCode <> "" Then 'If we are not on a code keep going, otherwise check it
            Dim pReg As New System.Text.RegularExpressions.Regex("[a-z]")
            Dim pRegCaps As New System.Text.RegularExpressions.Regex("[A-Z]")
            If pReg.IsMatch(pLine) OrElse pLine.Length <= 5 OrElse pRegCaps.IsMatch(pLine) = False OrElse (Strings.Left(pLine, 3) = "NOS" OrElse Strings.Left(pLine, 2) = "IQ") Then
                DestFileLine &= " " & pLine
            Else 'Is all caps word
                DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                CurCode = ""
                DestFileLine = ""
            End If
        End If
    Next

    If CurCode <> "" Then
        DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
        DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
        DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
        CurCode = ""
        DestFileLine = ""
    End If

    IO.File.WriteAllText(Replace(pFile.FullName, pFile.Extension, "_parsed" & pFile.Extension), DestFileText.ToString)
End Sub

End Class
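
For anyone not on .NET, the rule that Button1_Click uses to decide whether a line starts a new code can be sketched in Python roughly like this. It is only my reading of the VB code above and covers the detection step alone, not the continuation-line or all-caps heading handling. `split_code_line` is a name made up for the sketch, and the sample lines are illustrative rather than copied from the CDC files.

```python
def split_code_line(line):
    """Return (code, description) if the line starts a new code entry, else None."""
    line = line.replace("\t", " ").strip()
    if not line:
        return None

    # A code line starts with a digit, or with E/V followed by a digit.
    starts_numeric = line[0].isdigit()
    starts_e_or_v = len(line) > 3 and line[0] in "EV" and line[1].isdigit()
    if not (starts_numeric or starts_e_or_v):
        return None

    space_pos = line.find(" ")  # 0-based; -1 when there is no space
    if space_pos < 2:
        return None  # first token too short, or no description follows

    code = line[:space_pos]
    if code.endswith(".") or "-" in code:
        return None  # numbered list items ("1.") and code ranges ("001-009")

    return code, line[space_pos + 1:]


if __name__ == "__main__":
    for sample in (
        "001.0 Cholera due to Vibrio cholerae",
        "001-009 SAMPLE RANGE HEADING",
        "1. SAMPLE CHAPTER HEADING",
        "Excludes: something else",
    ):
        print(repr(sample), "->", split_code_line(sample))
```

The `space_pos < 2` test mirrors the VB check `SpacePos < 3`, since `InStr` is 1-based and returns 0 when nothing is found, while `str.find` is 0-based and returns -1.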
