
[hit-forum] Fw: FYI Very Interesting -- on Semantic; Codification / Term

To: hit-forum@xxxxxxxxxxxxxx
From: Niemann.Brand@xxxxxxxxxxxxxxx
Date: Sat, 19 Aug 2006 10:13:35 -0400
Message-id: <OF3B98F54F.0247A164-ON852571CF.004E25FC-852571CF.004E261F@xxxxxxxxxxxxxxx>
Yes and for the HIT Forum. Brand
-----Forwarded by Brand Niemann/DC/USEPA/US on 08/19/2006 10:08AM -----

To: Brand Niemann/DC/USEPA/US@EPA
From: marc.wine@xxxxxxx
Date: 08/17/2006 04:11PM
Subject: Fw: FYI Very Interesting -- on Semantic; Codification / Terminology Comment for HITSP BIO IS


Brand - Very interesting --
----- Forwarded by Marc R. Wine/XCI/CO/GSA/GOV on 08/17/2006 04:12 PM -----

From: "Elkin, Peter L. M.D." <Elkin.Peter@xxxxxxxx>
Date: 08/17/2006 01:45 PM
Please respond to: "Elkin, Peter L. M.D." <Elkin.Peter@xxxxxxxx>
To: TC-BIO@xxxxxxxxxxxxxxxxx
cc:
Subject: Re: Codification / Terminology Comment for HITSP BIO IS

 Dear Pat,
While I believe you are quite correct that the list of reportable conditions, with the addition of synonymy (which to date does not exist), would not require a coding service, early detection relies on finding signals within the free-text content of chief complaints and nursing notes (within the AHIC use case and data set), as well as organisms and both lab and radiology results.  The variance in use of language can lead to a signal confounded by missed synonymy and to inappropriate signaling from homonyms.  Because we need to be prepared to recognize and respond to subtle signs, so that a public health response can be mobilized before a threat is obvious to all, we need the highest-quality data that can be obtained from these free-text data sources.  Codification normalizes variability.  The specificity of retrieval from free-text queries, in the best of circumstances, is ~33% (our data); with coded chief complaints our specificity was ~98%.  That is a huge difference in the reliability of this type of data.
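A minimal sketch of the synonymy point, assuming a tiny hand-made synonym table. The phrases, concept labels, and function names are hypothetical; they are not drawn from any standard terminology or from the data cited above. It only illustrates why a literal free-text query can miss reports that a query over normalized concepts would find.

```python
import re

# Hypothetical synonym table: surface phrases mapped to one concept label.
# Illustrative only; a real terminology would be far larger.
SYNONYMS = {
    "shortness of breath": "DYSPNEA",
    "sob": "DYSPNEA",
    "difficulty breathing": "DYSPNEA",
    "fever": "FEVER",
    "febrile": "FEVER",
    "pyrexia": "FEVER",
}


def normalize(chief_complaint):
    """Return the set of concept labels whose phrases occur in the text."""
    text = chief_complaint.lower()
    return {
        concept
        for phrase, concept in SYNONYMS.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", text)
    }


def free_text_hits(complaints, term):
    """Literal substring query over the raw text."""
    return [c for c in complaints if term in c.lower()]


def coded_hits(complaints, concept):
    """Query over the normalized concepts instead of the raw text."""
    return [c for c in complaints if concept in normalize(c)]


# Made-up chief complaints for the example.
complaints = ["Fever and cough", "febrile, SOB x2 days", "pyrexia", "ankle sprain"]
```

With these made-up complaints, the literal query for "fever" finds one report, while the concept query for FEVER finds three. The sketch does nothing about homonyms, which need context that a lookup table cannot supply.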
What I am advocating is to make AHIC and ONC aware of the importance of this issue, to keep it in scope for the national HITSP agenda, and to specify it as a component of the interoperability specification with the simple input and output characteristics which we have specified for a first-cut coding service.  This will halt the, in my opinion, rather naive effort to push the requirement to perform coding to healthcare organizations, where it will likely be done in different, non-standard ways across organizations at a very high cost.  I believe that the country needs to step up to this challenge as part of our national biosurveillance agenda.
Pat, I want to personally thank and commend you for your dedication and thoughtfulness in your role on the HITSP Biosurveillance Technical Committee.  You have been tireless in your provision of high-quality information, and although we may disagree on this point, I have found your comments to be intelligent and helpful as we have moved both to select standards and to author the near-due interoperability specification.
With warm regards,
Peter
Peter L. Elkin, MD
Co-Chair, HITSP Biosurveillance Technical Committee
Professor of Medicine
Mayo Clinic College of Medicine

 
-----Original Message-----
From: Gibbons, Patricia S.
To: Hayward, Mark J.; Elkin, Peter L. M.D.
Sent: 8/17/2006 11:18 AM
Subject: RE: Codification / Terminology Comment for HITSP BIO IS

Mark - You are not on the distribution list, so you do not see other
participants' contributions.  At any rate, as with all communications on
these lists, the intention is to move to understanding and consensus
through professional dialogue on important issues.  Indeed this
particular dialogue has led (just today) to our group's finding out
about a national ontology effort through SiCOP and has broached the
important general question relating to problems introduced by the use of
a single use case.  Pat

 
  _____  
From: Hayward, Mark J.
Sent: Thursday, August 17, 2006 8:40 AM
To: Gibbons, Patricia S.; Elkin, Peter L. M.D.
Subject: RE: Codification / Terminology Comment for HITSP BIO IS

 
Very interesting dialogue and debate, but it may be best to continue the
debate just between the two of you and not through a distribution list.

 
  _____  
From: Gibbons, Patricia S.
Sent: Wednesday, August 16, 2006 8:25 PM
To: 'Elkin, Peter L. M.D.'; TC-BIO@xxxxxxxxxxxxxxxxx
Cc: Hayward, Mark J.
Subject: RE: Codification / Terminology Comment for HITSP BIO IS

Dear Peter - My concerns arise not so much from a belief that text
processing is not a promising semantic technology as from more practical
concerns having to do with this particular use case, for which, frankly,
the semantic requirements are quite simple:  for a given (finite and
small) list of conditions, specify which is the subject of this
particular report (from reporting entity to coordinating entities).  We
are, I believe, dealing with a list of (generously) 200 or so
conditions.  Let us not forget that these reports are useless until
aggregated at the local, State, or National level.  Indeed, the use case
we are describing does not concern itself with local peer-to-peer
reporting (unfortunately, in my view).  In such a case, the sublime
subtlety allowed by SNOMED seems like overkill.  This is an alerting and
statistical function, and the simpler the semantics in this instance, I
think, the better.

 
As you might guess, I feel somewhat like a heretic typing those words,
for my work to bring our group and others an awareness of the importance
of semantic interoperability is familiar at least to some.  Originally,
it was my sense that our original set of use cases seemed selected to
bypass semantics to the greatest extent possible, but that was clearly
impossible in this case.  I do believe, however, that semantic subtlety
is not necessary or helpful here.  What is needed is a "plain list" of
covered conditions that even my EMT son in Podunk, Montana can
understand.  Let them focus on criteria (plainly laid out) and rapid
reporting (whichever way is fastest).

 
Deriving diagnostic and case-descriptive meaning from text (I remain to
be convinced on this) is best suited to situations in which there is
only text.  The text accompanying a pigeon-holed diagnosis will always
be a rich source of nascent knowledge for research and clinical use, but
it is not computable, cross-codable, or easily translated into
actionable interventions.  What our use case is about is, indeed,
actionable intervention - at all layers of the public health
infrastructure.

 
One cannot argue, I suppose, against a "placeholder" for future
functionality; but to me this seems among the least apparent of
applications within which to insert such a placeholder.  To me, this is
very much an "if A do B" type action-oriented application, for which
semantic clarity (yes, even to the extent of forcing distinction and
classification) is required, as it is the analysis and subsequent action
which are most important.  In the HL7 interoperability white paper, this
issue is somewhat addressed: in situations of urgency, clarity and
priority are more important than subtlety and completeness.  That is the
plain reality of the "emergency" or "catastrophe."  As to whether this
might result in errors, I offer two counterarguments.  One is the "law
of large numbers," which deals with the fact that, when a deluge of
information occurs, the most commonly occurring item (e.g. a diagnosis
from a list of 200) becomes readily apparent.  The second is the
clustering of diagnoses (quite opaque to raw text); e.g. FLI will occur
along with H5N1.
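A minimal sketch of the first counterargument, assuming each report has already been reduced to one entry from a fixed condition list. The report tuples, jurisdiction names, and condition labels are invented for illustration. Once reports are aggregated, the dominant condition stands out in a simple tally.

```python
from collections import Counter


def tally(reports):
    """Count reports per condition; each report is (jurisdiction, condition)."""
    return Counter(condition for _, condition in reports)


# Made-up reports from reporting entities to a coordinating entity.
reports = [
    ("County A", "CONDITION_1"),
    ("County B", "CONDITION_1"),
    ("County A", "CONDITION_2"),
    ("County C", "CONDITION_1"),
]

counts = tally(reports)
top_condition, top_count = counts.most_common(1)[0]
```

The same tally grouped by jurisdiction, or by pairs of conditions reported together, would be one way to look at the clustering mentioned as the second counterargument.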

 
I believe, in this particular use case, our focus on semantics should be
concerned with aggregation and subsequent communication, alerting, and
action.  We have not begun to address this part of the use case.  I'm
hoping someone is....

 
Best wishes.
 
Pat
 
 
  _____  
From: Elkin, Peter L. M.D. [mailto:Elkin.Peter@xxxxxxxx]
Sent: Wednesday, August 16, 2006 6:25 PM
To: 'Gibbons, Patricia S.'; TC-BIO@xxxxxxxxxxxxxxxxx
Subject: RE: Codification / Terminology Comment for HITSP BIO IS

 
Dear Pat,
 
I think your skepticism regarding Text Processing is not warranted.  I
reference the following:

 

 
Elkin PL, Brown SH, Husser CS, Bauer BA, Wahner-Roedler D, Rosenbloom ST,
Speroff T.
Evaluation of the content coverage of SNOMED CT: ability of SNOMED
clinical terms to represent clinical problem lists.
Mayo Clin Proc. 2006 Jun;81(6):741-8.
PMID: 16770974 [PubMed - indexed for MEDLINE]
<http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=AbstractPlus&list_uids=16770974&query_hl=2&itool=pubmed_docsum>

 
There are many other groups besides ours that have done exemplary work in
this area, including Wendy Chapman at Pittsburgh, Carol Friedman at
Columbia, and Robert Baud from Geneva.

 
By specifying the coding service, meaning that the inputs and outputs
for the service need to be specified rather than how to perform NLP, we
put a placeholder in the interoperability specification to which future
work can be attached.  This is of paramount importance if we are to be
successful, as accurate surveillance requires normalization of text to
concepts of interest (diagnoses and case definitions).  We must not be
shy in our search for quality.
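A minimal sketch of what "specifying the inputs and outputs rather than how to perform NLP" could look like. Every name here (CodingRequest, CodedConcept, CodingService, LookupCodingService) and the "EXAMPLE" code system are assumptions made for illustration; they are not taken from the HITSP specification or from the first-cut coding service discussed in this thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class CodingRequest:
    """Input: the free text plus the field it came from (hypothetical shape)."""
    text: str
    source_field: str  # e.g. "chief_complaint"


@dataclass(frozen=True)
class CodedConcept:
    """Output: one normalized concept (hypothetical shape)."""
    code: str
    code_system: str
    display: str


class CodingService(Protocol):
    """Only the contract is fixed; how text becomes codes is left open."""

    def encode(self, request: CodingRequest) -> List[CodedConcept]:
        ...


class LookupCodingService:
    """Toy implementation of the contract: exact lookup in a small table."""

    def __init__(self, table: Dict[str, CodedConcept]) -> None:
        self._table = table

    def encode(self, request: CodingRequest) -> List[CodedConcept]:
        concept = self._table.get(request.text.strip().lower())
        return [concept] if concept else []


# Placeholder code and code system, not real terminology identifiers.
fever = CodedConcept(code="EX-001", code_system="EXAMPLE", display="Fever")
service: CodingService = LookupCodingService({"fever": fever})
result = service.encode(CodingRequest(text=" Fever ", source_field="chief_complaint"))
```

Any NLP-based encoder that accepts the same request and returns the same kind of result could later replace the lookup without changing callers, which is the sense in which the specification acts as a placeholder.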

 
Warm regards,
 
Peter
 
Peter L. Elkin, MD, FACP
Professor of Medicine
Department of Internal Medicine
Mayo Clinic College of Medicine
(507) 284-1551
Fax: (507) 284-5370

 
  _____  
From: owner-tc-bio@xxxxxxxxxxxxxxxxx [mailto:owner-tc-bio@xxxxxxxxxxxxxxxxx] On Behalf Of Gibbons, Patricia S.
Sent: Wednesday, August 16, 2006 2:36 PM
To: TC-BIO@xxxxxxxxxxxxxxxxx
Subject: Re: Codification / Terminology Comment for HITSP BIO IS

It is unclear that coding from free text will ever be preferable in
situations in which coded values are available.  We have a number of
initiatives at Mayo in this area, in which various approaches have been
taken.  Even the most successful of these require triage in terms of the
likelihood that an encoded entity (from natural language) is "good
enough."  In those instances in which the likelihood is considered not
to meet an established level of certainty, resolution by a trained coder
is required.  Though such methods can be expected to improve over time,
they are not yet mature, and will be, for the foreseeable future, based
more on statistics than on computable logic.  In addition, for NLP to be
effective, great control over the context of the data (word strings)
must be enforced; e.g. "reason for visit" cannot be answered by "brought
in by husband."  Indeed, the degree of control relinquished from the
content of the data field must instead be asserted at the contextual
level.  This is even more difficult than getting people to look up an
option on a list or in a code book.
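A minimal sketch of the triage step described above, assuming the encoder attaches a confidence score between 0 and 1 to each encoding. The Encoding class, the scores, the codes, and the 0.9 threshold are all hypothetical; no particular Mayo system is being described.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Encoding:
    text: str          # original free text
    code: str          # concept proposed by the encoder (placeholder values)
    confidence: float  # assumed score in [0, 1]


def triage(
    encodings: List[Encoding], threshold: float = 0.9
) -> Tuple[List[Encoding], List[Encoding]]:
    """Split encodings into auto-accepted and needing a trained coder."""
    accepted = [e for e in encodings if e.confidence >= threshold]
    needs_review = [e for e in encodings if e.confidence < threshold]
    return accepted, needs_review


# Made-up encoder output for the example.
batch = [
    Encoding("fever and cough", "EX-001", 0.97),
    Encoding("brought in by husband", "EX-999", 0.12),
    Encoding("feels off", "EX-042", 0.55),
]
accepted, needs_review = triage(batch)
```

Raising the threshold sends more items to the trained coder and lowers the risk of a wrong code being accepted automatically; where to set it is a policy choice this sketch does not address.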

 
Two other elements complicate our charge.  One is the huge geographical
and socioeconomic reach of the planned application, meaning that
training will be very difficult to assure on a uniform basis; it will be
essential to provide system instructions that are as basic and easy to
follow as possible.

 
Second, although I have heard there has been important progress in this
particular area (and its criticality has been noted by AHIC), there has
been no single list of reportable conditions that has been used
nationwide for public health reporting.  Establishing such a list would
seem to be a first priority, given its criticality for situational
awareness.  Even when such a list is available, requiring its use will
demand specific support at the level of the Secretary and determination
by the CDC to ensure its adoption.

 
Finally, this application is far too critical to the health and security
of the nation as a whole to not include "robustness" as a critical
characteristic for its success.  Let's not forget, we have no immediate
plans for all jurisdictions to be connected electronically in the early
months (or years!) of this foundational initiative.

 
Pat
 
 
-----Original Message-----
From: owner-tc-bio@xxxxxxxxxxxxxxxxx [mailto:owner-tc-bio@xxxxxxxxxxxxxxxxx] On Behalf Of Shaun Grannis
Sent: Wednesday, August 16, 2006 11:10 AM
To: TC-BIO@xxxxxxxxxxxxxxxxx
Subject: Re: Codification / Terminology Comment for HITSP BIO IS

 
Peter,
 
Your points are very well taken, and I think we all share the goal of
greater semantic interoperability. The challenge lies in crafting a
strategy to achieve the goal. The workgroup recommends deferring the
coding service for a few reasons:
 
First, the workgroup feels that current technology, processes, and
standards to implement automated codification of free form text are
either nascent, poorly specified, or non-existent. Second, even if the
technology were more mature, documenting the immensely complex process
of free form text codification cannot adequately be addressed to the
appropriate level of specificity over the next few days.
 
This is clearly a crucial activity and requires high-priority attention
over the near term. We strongly advocate for further research and
dissemination of best practices in this area. The workgroup revised the
draft statement (sent yesterday by Floyd) addressing your concerns, and
Anna/Lori will be sending out the revised text for your review.
 
Warmest Regards,
 
Shaun
 
--
Shaun J. Grannis, MD MS
Research Scientist, Regenstrief Institute
Department of Family Medicine
Indiana University School of Medicine
Voice: (317) 630-7494, Fax: (317) 630-6962
 
 
 
 
Elkin, Peter L. M.D. wrote:
> Dear Floyd,
>
> I believe that relegating coding to the edge systems virtually
> guarantees that it will never happen.  This is likely the most
> important core activity of biosurveillance as the accuracy of these
> encodings provide the greatest measure of reliability of the
> downstream analyses.  If the signals are flawed the conclusions from
> those signals will be unreliable.  We must make a stand for truth and
> beauty.  An interoperability specification is only viable if it
> provides interoperability.  Without coding there is no semantic
> interoperability.  It is my strong belief that we can and must provide
> a believable specification if this work is not to be totally ignored.
> By relegating coding to edge systems we demonstrate a clear
> misunderstanding of the capabilities of the current hospital
> information systems.  This disconnect is so fundamental as to show
> most readers that we have not done our homework.
>
> I ask that the TC reconsider keeping the coding service in scope as it
> is necessary for the filtering activity described in the AHIC
> Biosurveillance use case.
>
> Thank you kindly,
>
> Peter
>
>
> Peter L. Elkin, MD
> Professor of Medicine
> Mayo Clinic College of Medicine
> (507) 284-1551
> Fax: (507) 284-5370
>
>
>
------------------------------------------------------------------------

> *From:* Eisenberg, Floyd (MED US) [ mailto:floyd.eisenberg@xxxxxxxxxxx ]
> *Sent:* Tuesday, August 15, 2006 5:02 PM
> *To:* Peter Elkin M.D. (E-mail)
> *Cc:* Lori Fourquet (E-mail); Anna Orlova (E-mail); Shaun Grannis
> (E-mail); David O. Dobbs (E-mail); Edward Barthell (E-mail)
> *Subject:* Codification / Terminology Comment for HITSP BIO IS
>
> Peter,
>
> Thank you for all of your work on the document.
>
> After discussion in the BIO TC on the coding component today it was
> determined that a separate component document would be confusing.
> There is agreement on a clear need to address semantic
> interoperability and, therefore, the attached file lists a draft
> comment to add to the overall Interoperability Specification.  Please
> review and comment.  We look forward to your edits.
>
> Best Regards,
>
> Floyd
>
>
> Floyd P. Eisenberg, MD MPH
> SIEMENS Medical Solutions Health Services
>
> Clinical Informatics - Mailcode A17
> 51 Valley Stream Parkway, Malvern, PA, 19355-1406
> Phone:  +01 610 219 8547     Fax: +01 610 219 6518
> Cellular +01 215 290-6563
> E-Mail: Floyd.Eisenberg@xxxxxxxxxxx
>
>

