[hit-forum] Fw: FYI Very Interesting -- on Semantic; Codification / Term

To:	hit-forum@xxxxxxxxxxxxxx
From:	Niemann.Brand@xxxxxxxxxxxxxxx
Date:	Sat, 19 Aug 2006 10:13:35 -0400
Message-id:	<OF3B98F54F.0247A164-ON852571CF.004E25FC-852571CF.004E261F@xxxxxxxxxxxxxxx>

Yes and for the HIT Forum. Brand

-----Forwarded by Brand Niemann/DC/USEPA/US on 08/19/2006 10:08AM -----

To: Brand Niemann/DC/USEPA/US@EPA
From: marc.wine@xxxxxxx
Date: 08/17/2006 04:11PM
Subject: Fw: FYI Very Interesting -- on Semantic; Codification / Terminology Comment for HITSP BIO IS

Brand - Very interesting --
----- Forwarded by Marc R. Wine/XCI/CO/GSA/GOV on 08/17/2006 04:12 PM -----

"Elkin, Peter L. M.D." <Elkin.Peter@xxxxxxxx>
08/17/2006 01:45 PM
Please respond to
"Elkin, Peter L. M.D." <Elkin.Peter@xxxxxxxx>

To
TC-BIO@xxxxxxxxxxxxxxxxx
cc

Subject
Re: Codification / Terminology Comment for HITSP BIO IS

Dear Pat,
I believe you are quite correct that the list of reportable conditions, with the addition of synonymy (which to date does not exist), would not require a coding service. However, early detection relies on finding signals within the free text content of Chief Complaints and Nursing notes (within the AHIC use case and data set), as well as organisms and both lab and radiology results. Variance in the use of language can lead to a signal confounded by missed synonymy and by inappropriate signaling from homonyms. As we need to be prepared to recognize and respond to subtle signs, so that a public health response can be mobilized before a threat is obvious to all, we need the highest quality data that can be obtained from these free text data sources. Codification normalizes variability. The specificity of retrieval in the best of circumstances from free text queries is ~33% (our data), and with coded chief complaints our specificity was ~98%. There is a huge difference in the reliability of this type of data.
What I am advocating is to make AHIC and ONC aware of the importance of this issue, to keep it in scope for the national HITSP agenda, and to specify it as a component of the interoperability specification with the simple input and output characteristics which we have specified for a first cut coding service. This will halt the, in my opinion, rather naive effort to push the requirement to perform coding to healthcare organizations, where it will likely be done in different non-standard ways across organizations at a very high cost. I believe that the country needs to step up to this challenge as part of our national biosurveillance agenda.
Pat, I want to personally thank and commend you for your dedication and thoughtfulness in your role on the HITSP Biosurveillance Technical Committee. You have been tireless in your provision of high quality information, and although we may disagree on this point I have found your comments to be intelligent and helpful as we have moved to both select standards and author the near due interoperability specification.
With warm regards,
Peter
Peter L. Elkin, MD
Co-Chair, HITSP Biosurveillance Technical Committee
Professor of Medicine
Mayo Clinic College of Medicine

-----Original Message-----
From: Gibbons, Patricia S.
To: Hayward, Mark J.; Elkin, Peter L. M.D.
Sent: 8/17/2006 11:18 AM
Subject: RE: Codification / Terminology Comment for HITSP BIO IS
Mark - You are not on the distribution list, so you do not see other
participants' contributions. At any rate, as with all communications on
these lists, the intention is to move to understanding and consensus
through professional dialogue on important issues. Indeed this
particular dialogue has led (just today) to our group's finding out
about a national ontology effort through SiCOP and has broached the
important general question relating to problems introduced by the use of
a single use case. Pat

_____
From: Hayward, Mark J.
Sent: Thursday, August 17, 2006 8:40 AM
To: Gibbons, Patricia S.; Elkin, Peter L. M.D.
Subject: RE: Codification / Terminology Comment for HITSP BIO IS

Very interesting dialogue and debate, but it may be best to continue the
debate just between the two of you and not through a distribution list.

_____
From: Gibbons, Patricia S.
Sent: Wednesday, August 16, 2006 8:25 PM
To: 'Elkin, Peter L. M.D.'; TC-BIO@xxxxxxxxxxxxxxxxx
Cc: Hayward, Mark J.
Subject: RE: Codification / Terminology Comment for HITSP BIO IS
Dear Peter - My concerns arise not so much from a belief that text
processing is not a promising semantic technology as from more practical
concerns having to do with this particular use case, for which frankly, the
semantic requirements are quite simple: for a given (finite and small)
list of conditions, specify which is the subject of this particular
report (from reporting entity to coordinating entities). We are, I
believe, dealing with a list of (generously) 200 or so conditions. Let
us not forget that these reports are useless until aggregated at the
local, State, or National level. Indeed, the use case we
are describing does not concern itself with local peer-to-peer reporting
(unfortunately, in my view). In such a case, the sublime subtlety
allowed by SNOMED seems like overkill. This is an alerting and
statistical function, and the simpler the semantics in this instance, I
think, the better.

As you might guess, I feel somewhat like a heretic typing those words,
for my work to bring our group and others an awareness of the importance
of semantic interoperability is familiar at least to some. Originally,
it was my sense that our original set of use cases seemed selected to
bypass semantics to the greatest extent possible, but that was clearly
impossible in this case. I do believe, however, that semantic subtlety
is not necessary or helpful here. What is needed is a "plain list" of
covered conditions that even my EMT son in Podunk Montana can
understand. Let them focus on criteria (plainly laid out) and rapid
reporting (whichever way is fastest).

Deriving diagnostic and case-descriptive meaning from text (I remain to
be convinced on this) is best suited when there is only text. The
text accompanying a pigeon-holed diagnosis will always be a rich source
of nascent knowledge for research and clinical use, but is not
computable, cross-codable or easily translated into actionable
interventions. What our use case is about is, indeed, actionable
intervention - at all layers of the public health infrastructure.

One cannot argue, I suppose, against a "placeholder" for future
functionality; but to me this seems among the least apparent of
applications within which to insert such a placeholder. To me, this is
very much an "if A do B" type action-oriented application, for which
semantic clarity (yes, even to the extent of forcing distinction and
classification) is required, as it is the analysis and subsequent action
which are most important. In the HL7 interoperability white paper, this
issue is somewhat addressed: in situations of urgency, clarity and
priority are more important than subtlety and completeness. That is the
plain reality of the "emergency" or "catastrophe." As to whether this
might result in errors, I offer two counterarguments - one is the "law
of large numbers" which deals with the fact that, when a deluge of
information occurs, the most commonly occurring (e.g. diagnosis of a
list of 200) becomes readily apparent. The second is the clustering of
diagnoses (quite opaque to raw text); e.g. FLI will occur along with
H5N1.
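
As a rough illustration of the "law of large numbers" point above, the sketch below counts coded reports by condition so that the most frequent ones stand out. The report structure, the field names, and the sample conditions are all hypothetical and chosen only for illustration; nothing here is taken from the HITSP specification.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical coded reports: each one names exactly one condition
# from a small, fixed list (illustrative values only).
SAMPLE_REPORTS: List[Dict[str, str]] = [
    {"jurisdiction": "County A", "condition": "influenza-like illness"},
    {"jurisdiction": "County A", "condition": "influenza-like illness"},
    {"jurisdiction": "County B", "condition": "salmonellosis"},
    {"jurisdiction": "County B", "condition": "influenza-like illness"},
    {"jurisdiction": "County C", "condition": "pertussis"},
    {"jurisdiction": "County C", "condition": "salmonellosis"},
]


def top_conditions(reports: List[Dict[str, str]], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequently reported conditions with their counts."""
    counts = Counter(report["condition"] for report in reports)
    return counts.most_common(n)


if __name__ == "__main__":
    for condition, count in top_conditions(SAMPLE_REPORTS):
        print(f"{condition}: {count}")
```

Because every report carries one value from a closed list, aggregation is a simple count; no text interpretation is needed at this step.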

I believe, in this particular use case, our focus on semantics should be
concerned with aggregation and subsequent communication, alerting, and
action. We have not begun to address this part of the use case. I'm
hoping someone is....

Best wishes.

Pat

_____
From: Elkin, Peter L. M.D. [ mailto:Elkin.Peter@xxxxxxxx ]
Sent: Wednesday, August 16, 2006 6:25 PM
To: 'Gibbons, Patricia S.'; TC-BIO@xxxxxxxxxxxxxxxxx
Subject: RE: Codification / Terminology Comment for HITSP BIO IS

Dear Pat,

I think your skepticism regarding Text Processing is not warranted. I
reference the following:

Elkin PL, Brown SH, Husser CS, Bauer BA, Wahner-Roedler D, Rosenbloom ST,
Speroff T.
Evaluation of the content coverage of SNOMED CT: ability of SNOMED
clinical terms to represent clinical problem lists.
Mayo Clin Proc. 2006 Jun;81(6):741-8.
PMID: 16770974 [PubMed - indexed for MEDLINE]
< http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=AbstractPlus&list_uids=16770974&query_hl=2&itool=pubmed_docsum >

There are many other groups besides ours that have done exemplary work in
this area, including Wendy Chapman at Pittsburgh, Carol Friedman at
Columbia and Robert Baud from Geneva.

By specifying the coding service (that is, its inputs and outputs
rather than how to perform NLP), we put a placeholder in the
interoperability specification to which future work can be attached.
This is of paramount importance if we are to be
successful, as accurate surveillance requires normalization of text to
concepts of interest (diagnoses and case definitions). We must not be
shy in our search for quality.
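
To make the "inputs and outputs" idea concrete, here is a minimal sketch of what a first cut coding service interface might look like: free text in, a normalized concept (or nothing) out. The function name, the result type, and the synonym table are hypothetical and for illustration only; the table is not drawn from SNOMED CT or any real terminology, and an exact-match lookup stands in for whatever NLP method a real service would use.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical synonym table mapping surface strings to one concept label.
# Entries are illustrative only, not taken from any real terminology.
SYNONYMS = {
    "sob": "dyspnea",
    "short of breath": "dyspnea",
    "shortness of breath": "dyspnea",
    "difficulty breathing": "dyspnea",
    "fever": "fever",
    "febrile": "fever",
    "high temperature": "fever",
}


@dataclass(frozen=True)
class CodingResult:
    text: str                # the free text exactly as submitted
    concept: Optional[str]   # normalized concept, or None if not recognized


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def code_chief_complaint(text: str) -> CodingResult:
    """Input: a free-text chief complaint. Output: text plus concept, if any."""
    return CodingResult(text=text, concept=SYNONYMS.get(normalize(text)))


if __name__ == "__main__":
    for complaint in ["SOB", "  Short of  Breath ", "brought in by husband"]:
        print(code_chief_complaint(complaint))
```

The point of the sketch is the shape of the contract, not the lookup: any implementation that accepts text and returns a concept or an explicit "no match" could sit behind the same interface.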

Warm regards,

Peter

Peter L. Elkin, MD, FACP
Professor of Medicine
Department of Internal Medicine
Mayo Clinic College of Medicine
(507) 284-1551
Fax: (507) 284-5370

_____
From: owner-tc-bio@xxxxxxxxxxxxxxxxx
[ mailto:owner-tc-bio@xxxxxxxxxxxxxxxxx ] On Behalf Of Gibbons, Patricia
S.
Sent: Wednesday, August 16, 2006 2:36 PM
To: TC-BIO@xxxxxxxxxxxxxxxxx
Subject: Re: Codification / Terminology Comment for HITSP BIO IS
It is unclear that coding from free text will ever be preferable in
situations in which coded values are available. We have a number of
initiatives at Mayo in this area, in which various approaches have been
taken. Even the most successful of these require triage in terms of the
likelihood that an encoded entity (from natural language) is "good
enough." In those instances in which the likelihood is considered not to
meet an established level of certainty, resolution by a trained coder is
required. Though such methods can be expected to improve over time,
they are not yet mature, and will be, for the foreseeable future, based
more on statistics than on computable logic. In addition, for NLP to be
effective, great control over the context of the data (word strings)
must be enforced; e.g. "reason for visit" cannot be answered by "brought
in by husband." -- Indeed, the degree of control relinquished from the
content of the data field must be instead asserted at the contextual
level. This is even more difficult than getting people to look up an
option on a list or in a code book.
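
The triage step described above can be sketched as follows: each automatically encoded item carries a likelihood, and anything below an agreed level of certainty is routed to a trained coder. The tuple layout, the function name, and the 0.9 threshold are assumptions made for illustration, not values from any of the Mayo initiatives mentioned.

```python
from typing import List, Tuple

# (original text, proposed concept, likelihood between 0 and 1); hypothetical layout
Candidate = Tuple[str, str, float]


def triage(
    candidates: List[Candidate], threshold: float = 0.9
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split candidates into auto-accepted codes and texts needing a human coder.

    The threshold is an assumed, tunable level of certainty.
    """
    accepted: List[Tuple[str, str]] = []
    needs_review: List[str] = []
    for text, concept, likelihood in candidates:
        if likelihood >= threshold:
            accepted.append((text, concept))
        else:
            needs_review.append(text)
    return accepted, needs_review


if __name__ == "__main__":
    sample = [
        ("shortness of breath", "dyspnea", 0.97),
        ("brought in by husband", "dyspnea", 0.12),
    ]
    accepted, needs_review = triage(sample)
    print("accepted:", accepted)
    print("needs review:", needs_review)
```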

Two other elements complicate our charge. One is the huge geographical
and socioeconomic reach of the planned application, meaning that
training will be very difficult to assure on a uniform basis; it will be
essential to provide system instructions that are as basic and easy to
follow as possible.

Second, although I have heard there has been important progress in this
particular area (and its criticality has been noted by AHIC),
there has been no single list of reportable conditions that has been
used nationwide for public health reporting. Establishing such a list
would seem to be a first priority, given its criticality for situational
awareness. Even when such a list is available, requiring its use will
demand specific support at the level of the Secretary and determination
by the CDC to ensure its adoption.

Finally, this application is far too critical to the health and security
of the nation as a whole to not include "robustness" as a critical
characteristic for its success. Let's not forget, we have no immediate
plans for all jurisdictions to be connected electronically in the early
months (or years!) of this foundational initiative.

Pat

-----Original Message-----
From: owner-tc-bio@xxxxxxxxxxxxxxxxx
[ mailto:owner-tc-bio@xxxxxxxxxxxxxxxxx ] On Behalf Of Shaun Grannis
Sent: Wednesday, August 16, 2006 11:10 AM
To: TC-BIO@xxxxxxxxxxxxxxxxx
Subject: Re: Codification / Terminology Comment for HITSP BIO IS

Peter,

Your points are very well taken, and I think we all share the goal of
greater semantic interoperability. The challenge lies in crafting a
strategy to achieve the goal. The workgroup recommends deferring the
coding service for a few reasons:

First, the workgroup feels that current technology, processes, and
standards to implement automated codification of free form text are
either nascent, poorly specified, or non-existent. Second, even if the
technology were more mature, documenting the immensely complex process
of free form text codification cannot adequately be addressed to the
appropriate level of specificity over the next few days.

This is clearly a crucial activity and requires high-priority attention
over the near term. We strongly advocate for further research and
dissemination of best practices in this area. The workgroup revised the
draft statement (sent yesterday by Floyd) addressing your concerns, and
Anna/Lori will be sending out the revised text for your review.

Warmest Regards,

Shaun

--
Shaun J. Grannis, MD MS
Research Scientist, Regenstrief Institute
Department of Family Medicine
Indiana University School of Medicine
Voice: (317) 630-7494, Fax: (317) 630-6962

Elkin, Peter L. M.D. wrote:
> Dear Floyd,
>
> I believe that relegating coding to the edge systems virtually
> guarantees that it will never happen. This is likely the most
> important core activity of biosurveillance as the accuracy of these
> encodings provides the greatest measure of reliability of the
> downstream analyses. If the signals are flawed the conclusions from
> those signals will be unreliable. We must make a stand for truth and
> beauty. An interoperability specification is only viable if it
> provides interoperability. Without coding there is no semantic
> interoperability. It is my strong belief that we can and must provide
> a believable specification if this work is not to be totally ignored.
> By relegating coding to edge systems we demonstrate a clear
> misunderstanding of the capabilities of the current hospital
> information systems. This disconnect is so fundamental as to show
> most readers that we have not done our homework.
>
> I ask that the TC reconsider keeping the coding service in scope as it
> is necessary for the filtering activity described in the AHIC
> Biosurveillance use case.
>
> Thank you kindly,
>
> Peter
>
>
> Peter L. Elkin, MD
> Professor of Medicine
> Mayo Clinic College of Medicine
> (507) 284-1551
> Fax: (507) 284-5370
>
>
>
------------------------------------------------------------------------
> *From:* Eisenberg, Floyd (MED US) [ mailto:floyd.eisenberg@xxxxxxxxxxx ]
> *Sent:* Tuesday, August 15, 2006 5:02 PM
> *To:* Peter Elkin M.D. (E-mail)
> *Cc:* Lori Fourquet (E-mail); Anna Orlova (E-mail); Shaun Grannis
> (E-mail); David O. Dobbs (E-mail); Edward Barthell (E-mail)
> *Subject:* Codification / Terminology Comment for HITSP BIO IS
>
> Peter,
>
> Thank you for all of your work on the document.
>
> After discussion in the BIO TC on the coding component today it was
> determined that a separate component document would be confusing.
> There is agreement on a clear need to address semantic
> interoperability and, therefore, the attached file lists a draft
> comment to add to the overall Interoperability Specification. Please
> review and comment. We look forward to your edits.
>
> Best Regards,
>
> Floyd
>
>
> Floyd P. Eisenberg, MD MPH
> SIEMENS Medical Solutions Health Services
>
> Clinical Informatics - Mailcode A17
> 51 Valley Stream Parkway, Malvern, PA, 19355-1406
> Phone: +01 610 219 8547 Fax: +01 610 219 6518
> Cellular +01 215 290-6563
> E-Mail: Floyd.Eisenberg@xxxxxxxxxxx
>
>


_________________________________________________________________
Message Archives: http://colab.cim3.net/forum/hit-forum/
Subscribe/Unsubscribe/Config: http://colab.cim3.net/mailman/listinfo/hit-forum/
Shared Files: http://colab.cim3.net/file/work/hit/
Community Wiki: http://colab.cim3.net/wiki/
Community Portal: http://colab.cim3.net/
To Post: mailto:hit-forum@xxxxxxxxxxxxxx
