Wednesday, July 3, 2013

NIST Releases Draft Outline of Cybersecurity Framework for Critical Infrastructure

I like what NIST does regarding security guidance. I know that they are a USA government body, so those outside the USA have some reservations. I find, however, that they hit all the right buttons in their security specifications. They are catching up a bit in their understanding of Privacy.

I have high hopes, but not too high, for their new Cybersecurity Framework. First, I am disappointed that NIST would be dragged into a buzzword and forced to say "cybersecurity" as if it were a term that everyone fully understands. But sometimes one must play buzzword bingo.
As part of its efforts to develop a voluntary framework to improve cybersecurity in the nation's critical infrastructure, the National Institute of Standards and Technology (NIST) has posted a draft outline of the document to invite public review and gather comments.
The Executive Order calling for NIST to develop the framework directs the agency to collaborate with the public and private sectors. The draft outline reflects input received in response to a February 2013 Request for Information, discussions at two workshops and other forms of stakeholder engagement.
The framework so far is useless, but their approach is good. It will be risk-based, and leverage existing standards. This is music to my ears.
The draft outline and other documents related to the Cybersecurity Framework are available at http://www.nist.gov/itl/cyberframework.cfm.
The most informative part of this announcement is their presentations.

Sunday, June 30, 2013

Technology Trends in Healthcare IT

Summer is a good time to assess the existing trends and project which ones will continue through the year. I would define the technology trends I see in Healthcare IT as “Mobile”, “Privacy”, and “Services”.

Within them I do include specific technologies that are today’s buzzwords.
  • Mobile – portable Data, portable Applications, portable Profiles, portable Devices, portable Sensors, portable Sessions, portable Security – mobile as a user experience – mHealth is the most used buzzword so far this year. 
  • Privacy – Respect the individual, be Transparent with the individual, and Security of systems and data – privacy as a user experience – The Patient and Provider MUST trust that what they offer to Health IT will be maintained properly. Trust is easy to get the first time, very quick to lose, and almost impossible to recover from.
  • Services – Not necessarily as formal as SOA, but much like the SOA religion. Mostly services that enable HIE. Anything that could be a reusable service is "designed as" a service. Any service can be local, remote, or cloud hosted. I think the concept of "Services" is more descriptive than "Cloud", although for maximum buzzword points, "Cloud" usually wins. 
Note that these three technology areas overlap and are highly related. However, they are on different tracks and require different development and maturation.

As always, I have an indexed list of my articles by topic area.

Saturday, June 29, 2013

Redaction fail - Lessons healthcare must learn from

Interesting article on the problem with redaction, possibly exacerbated by human redaction. The article focuses on military/political classification systems and the de-classification system we have. This system is based on humans who are trained experts. Their training, expertise, and guidelines tend to be highly variable. The article even includes explicit examples where two different people came to exactly opposite conclusions on redaction, so the combination of their outputs revealed everything.

The best part of the article is the paragraph:
"The idea that you can slot all knowledge into neat little categories that perfectly overlap with our security concerns is already a problematic one, as Peter Galison has argued. Galison’s argument is that security classification systems assume that knowledge is “atomic,” which is to say, comes in discrete bundles that can be disconnected from other knowledge (read “atomic” like “atomic theory” and not “atomic bomb”). The study of knowledge (either from first principles or historically) shows exactly the opposite— knowledge is constituted by sending out lots of little tendrils to other bits of knowledge, and knowledge of the natural world is necessarily interconnected. If you know a little bit about one thing you often know a little bit about everything similar to it."
De-Identification focuses on the removal of identifiers. This is hard enough, but nowhere near as hard as removing intelligence. That is not to say that healthcare has it easy; we do have sensitive health topics that are just as hard to handle. We should just not equate government redaction systems with the systems we need in healthcare to support clinical trials, clinical research, public health reporting, etc.

See also: De-Identification

Monday, June 24, 2013

Internet User Authorization: why and where

User identity, authentication, and the user's authorization given to applications are hot topics in security and privacy. The latest darling on the block is oAuth, championed by Google, Facebook, Twitter, LinkedIn, Salesforce.com, and Amazon.

This technology has just been profiled by IHE in the Internet User Authorization (IUA) profile, which is out for public comment right now. http://www.ihe.net/Technical_Framework/public_comment.cfm#IT

oAuth is good for:

This technology specification has some advantages over others, but mostly in the space of:
  1. Internet facing web services
  2. Web services available to the public
  3. Applications that are installed on mobile devices using internet facing web services
  4. Functionality that interact with social networking
  5. Functionality that can utilize identity and authentication managed elsewhere
  6. Web Browser or RESTful API
This does not mean that it is good for everything, nor that it is limited to these uses. It is just good to understand where a technology fits best.

Both SAML and oAuth act rather similarly from a browser experience, although oAuth has functionality that doesn't exist in SAML. In oAuth, the authenticated user can endorse an application (mobile app or web-service) as having the authority of the user. That is, the user can delegate their identity to an application, which can then use that identity transparently in the background. This also comes with the ability of the user to revoke that authority. It is this functionality that differentiates oAuth.

SAML is still a better solution for:

  1. Backend communications that need a user identity
  2. Business-to-Business communications
  3. Federation of Identity
  4. Federation of Access Control decision points
  5. When the Identity Provider is Active Directory – Active Directory Federation Services
  6. SOAP web-services, although SAML is also supported for Web Browser and RESTful API
It is more likely that an organization will host an Active Directory for their users and enable Active Directory Federation Services exposing SAML Assertions than that they will use oAuth. This might be a maturity thing that changes over time, or it might be a long-term reality. SAML Assertions are generally seen as more business focused, whereas oAuth is more consumer focused.

I have covered this a bit more in What User Authentication to use?

IUA --> SAML + oAuth

The IHE IUA profile profiles a join between these two worlds. An option in the IUA profile utilizes an IETF draft RFC that defines how to carry SAML Assertions within the oAuth infrastructure, thus bringing together the benefits of both, although also bringing along baggage that might be seen as the combination of the negatives.

The IUA profile is truly just an IHE profile. It only strives to profile the underlying standard to meet Healthcare needs. It is NOT a tutorial on how to use oAuth, how to write your code, or how to configure a directory. These are all well documented today, and/or are opportunities for others to fill in. The IHE profile is simply taking the needs of healthcare and showing how to utilize oAuth to achieve them.

It recognizes the default oAuth use of bearer tokens, which are really only useful in the case where the oAuth token issuer is the same organization as the web resource being approached. This is because a bearer token is opaque, meaning the web resource gets no information about the user; the access control decision must be made entirely at the oAuth token issuer. This is an okay solution, but it doesn't really enable much in the way of Interoperability. So IHE acknowledges this mode and allows it, but doesn't really utilize it in the profile.

The IUA profile primarily uses JWT token types. These token types are readable, as they are encoded using JSON attribute encoding. The IUA profile shows how to carry the typical healthcare-specific information within this package: User Role, Purpose Of Use, etc.
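To make the JWT token type concrete, here is a minimal sketch of constructing an HMAC-signed JWT carrying healthcare-style claims. This is illustrative only: the claim names (`user_role`, `purpose_of_use`), the subject, and the shared secret are my assumptions, not the attribute encoding that IUA actually specifies.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWT uses base64url encoding without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(claims: dict, secret: bytes) -> str:
    # A JWT is three base64url parts: header.payload.signature
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

token = make_jwt(
    {
        "sub": "dr.smith@example.org",   # authenticated user (illustrative)
        "user_role": "physician",        # illustrative claim name
        "purpose_of_use": "TREATMENT",   # illustrative claim name
        "exp": 1700000000,               # expiry as a Unix timestamp
    },
    secret=b"shared-secret",
)
print(token.count("."))  # → 2 (header.payload.signature)
```

Because the payload is just base64url-encoded JSON, the web resource can read the role and purpose-of-use directly, which is exactly what makes JWT more interoperable than an opaque bearer token.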

Lastly, the IUA profile added an option that uses an IETF draft for carrying a SAML assertion. In the case of IUA, this SAML assertion is profiled by XUA. Thus this is an encapsulation of a SAML assertion inside oAuth: IUA(XUA).

Public comment is open until July 3rd. Please get your comments in.

Tuesday, May 14, 2013

Security Tutorials on mHealth Security and Auditing - #FHIR

The two presentations that I gave at the HL7 meeting (Wednesday afternoon's "Free Security Tutorial", and again at the joint Security/EHR/FHIR/SOA meeting on Thursday) are posted on the HL7.org web site. They are:

Security Education: mHealth Security and FHIR

This presentation is made up of the current viewpoint on mHealth security basics, risk-assessment models, network communications security, and user identity and access management. This information is on the HL7 FHIR site, and will improve over the coming month. Front and center is the IHE Internet User Authorization (IUA) profile, a profiling of oAuth 2.0. Much of the material I cover is also covered on my blog at the following:

Security Education: Security/Privacy Audit Logging and Reporting

Wednesday, May 1, 2013

De-Identification - Data Chemistry

The concept of de-identification is a recurring theme in my circles. The use of the term de-identification that I use is the broader term, well beyond the constraints of HIPAA. I use the term de-identification to refer to the process of reducing the risk of privacy or identity exposure by modifying the data. This includes using pseudonyms, known as pseudonymization; and also includes removing data elements, known as anonymization. Therefore De-Identification is made up of both Pseudonymization and Anonymization.

I am involved in much of the standards work in this space, actively working in IHE on a handbook and in ISO on updates to the core standard on the subject. In all of these cases we are trying to make the 'art' of de-identification more measurable, repeatable, and approachable. Too often it is seen as too hard; more often it is seen as simple, and thus mistakes are made. My goal is to make it clear.

Why De-Identify?

First, one must understand that de-identification is just a method of lowering risk. The only way to get risk to zero is to have zero data. Even one data element that one might consider purely clinical narrows down the population. Simply indicating that the weight of the subject is 203 lbs tells you much about the subject; if that value is 3 lbs you know the subject is likely a premature baby, and if it is 403 lbs it is clear you have limited the population. The first point is that all data are potentially identifiable; some data are just less so.

Second, one must recognize that some data are outright Direct Identifiers. These data are, in no uncertain terms, identifiers. Full name is the most obvious. A Direct Identifier is something that is publicly known (knowable): full addresses, phone numbers, credit-card numbers, and driver's-license numbers. These items clearly can't be included in the de-identified data set, so they each need to be identified as a risk to be mitigated.

There is also a class of data that can be used in combination with other data in the data set to identify a subject, such as postal codes, sex, date of birth, hospital identifier, or date of procedure. These are risky to leave in, so they need to be identified as potential risks to be mitigated.
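The risk from these indirect identifiers can be illustrated with a k-anonymity-style check: count how many records share the same combination of indirect identifier values. Any group of size 1 is a uniquely identifiable record. The records and field names below are invented toy data, and this is a sketch of the idea, not a full re-identification risk assessment.

```python
from collections import Counter

# Toy records: zip, sex, and birth year are indirect identifiers.
records = [
    {"zip": "53226", "sex": "M", "birth_year": 1960, "diagnosis": "flu"},
    {"zip": "53226", "sex": "M", "birth_year": 1960, "diagnosis": "asthma"},
    {"zip": "53226", "sex": "F", "birth_year": 1982, "diagnosis": "flu"},
]

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group sharing the same indirect-identifier
    values. A group of size 1 means that record is unique in the data
    set, and hence at high risk of re-identification."""
    groups = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return min(groups.values())

# Zip alone leaves everyone in one group of 3; adding sex and birth
# year isolates the third record into a group of 1.
print(smallest_group(records, ["zip"]))                        # → 3
print(smallest_group(records, ["zip", "sex", "birth_year"]))   # → 1
```

This is why each indirect identifier must be inventoried: individually harmless fields combine to shrink the crowd a subject hides in.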

The task of De-Identification is much like chemistry, sometimes bio-chemistry. One must understand the elements and how they interact. One must use various tools to separate or modify the elements. Each chemical process results in something useful for the purpose it was created for. Some combinations of chemicals are very volatile, others benign, but all must be given respect.

De-Identification Procedure

The procedure is simple. I'll include only the high level; each step is more involved than I indicate here:
  1. Identify what it is you want to do with data. This is your use-case. What are critical data attributes, and what are acceptable tolerances for each data attribute. You need to justify each element you want. You must also identify the acceptable level of risk, which includes assessment of the authorizations you have.
  2. Identify ALL of the data elements that you have. This is the data set that has not been de-identified. It might be a database, it might be a stream. You must identify all of the data, not just the data you are worried about. You then classify each attribute: Direct, Indirect, or simple data. Note that any unstructured data, otherwise known as free-text, must be considered a Direct Identifier. 
  3. Apply Mitigations, in theory. Given the use-case details you created in (1) and the data-element inventory you created in (2); apply the de-identification tools. (a) Redact - delete element, (b) Fuzz - modify within tolerance, (c) generalize - broader terms, or (d) replace - pseudonym. These are clearly not all the tools but the large categories of tools.
  4. Assess risk, in theory. How correlated are the data to a subject? Is this level of risk acceptable to the policy identified in (1)? Don't change your policy; that is the easy way out. Continue to apply mitigations. If further mitigations result in data that are not useful to your use-case, then you might need to change something else. 
  5. Apply Mitigations to the data-set and validate the results. As with any design-of-experiments, one must be able to prove the theory. Are the resulting data just as de-identified as you expected? Are the resulting data useful for your use-case?
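The four mitigation tools named in step (3) can be sketched in a few lines of Python. The record, field names, tolerance, and the salted-truncated-hash pseudonym below are illustrative assumptions, not a prescribed algorithm:

```python
import hashlib
import random

def redact(record, field):
    record.pop(field, None)  # (a) Redact: delete the element outright

def fuzz(record, field, tolerance):
    # (b) Fuzz: modify a numeric value within an acceptable tolerance
    record[field] += random.uniform(-tolerance, tolerance)

def generalize_zip(record):
    # (c) Generalize: keep only the broader region of the postal code
    record["zip"] = record["zip"][:3] + "**"

def pseudonymize(record, field, salt):
    # (d) Replace: a consistent pseudonym via a salted hash
    # (the salt must be kept secret, or the pseudonym is reversible
    # by brute force over the identifier space)
    digest = hashlib.sha256((salt + record[field]).encode()).hexdigest()
    record[field] = digest[:12]

record = {"name": "Jane Doe", "zip": "53226",
          "weight_lbs": 203.0, "mrn": "12345"}
redact(record, "name")
fuzz(record, "weight_lbs", tolerance=5.0)
generalize_zip(record)
pseudonymize(record, "mrn", salt="per-project-secret")
print(sorted(record))  # → ['mrn', 'weight_lbs', 'zip']
```

Which tool fits which element is exactly the analysis steps (1) and (2) feed: the use-case tells you the tolerance you can afford, and the inventory tells you which elements need which treatment.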
However well you have de-identified, recognize that there is residual risk that needs to be managed. This risk is often significant, thus requiring good security practices. Just because you think your data are de-identified does not mean you don't need to protect them. Attacks against de-identified data only get better; they never get worse.

De-Identification is Contextual

I have said exactly this (De-Identification is highly contextual) before. The de-identification algorithm you come up with will not be useful for a different use-case or a different data-set. It might be, but the assessment needs to be made. The context behind the needs of the use-case is critical. Take only the data, and only the fidelity of the data, that you need. 

Gross De-identification

There are use-cases for doing a gross de-identification into a large data set, followed by secondary use-cases with their own further de-identification analysis. This is often done in population-health analysis: using gross de-identification to fill the population database, while re-assessing the results of any sub-analysis of a specific population health epidemic. Clearly the large database needs to be protected quite strongly; I might say it needs to be protected just as well as a full-fidelity database.

Summary

De-Identification is a technical tool. It is not a get-out-of-jail card. The resulting data set likely still requires some protection and safe handling.

Friday, April 26, 2013

Privacy Consent State of Mind

The space of Privacy Consent is full of trepidation. I would like to show that although there is complexity, there is also simplicity. The complexity comes in the fine details. The fundamentals, and the technology, are simple.

Privacy Consent can be viewed as a "State Diagram"; that is, by showing the current state of a patient's consent, we can show the changes in state. This is the modeling tool I will use here.

I will focus on how Privacy Consent relates to access to Health Information that is shared through some form of Health Information Exchange (HIE). The architecture of this HIE doesn't matter; it could be PUSH or PULL or anything else. The concepts I show can apply anywhere, but for simplicity think only about the broad use of healthcare information sharing across organizations.

There are two primary models for Privacy Consent, referred to as "OPT-IN" and "OPT-OUT".

Privacy Consent of OPT-OUT

At the left is the diagram for an OPT-OUT environment, one where the patient has the choice to OPT-OUT, that is, to stop the use of their data. This means that there is a presumption that when there is no evidence of a choice by the patient, the data can be used.

This model is also referred to as "Implicit Consent". The USA HIPAA Privacy Regulation utilizes this model for Privacy Consent within an organization. It is not clear to me that this HIPAA Privacy Regulation 'Implicit Consent' is expected to be used outside the original Covered Entity. It is a model used by many states in the USA.

The advantages typically pointed to with this model are that many individuals don't want to be bothered with the choice; these individuals trust their healthcare providers. Another factor often brought up is that when health treatment is needed, the patient is often not in good health and therefore not well capable of making decisions; this, however, focuses on legitimate uses and ignores improper uses. Privacy worries about both proper and improper access.

Privacy Consent of OPT-IN

At the right is the diagram for an OPT-IN environment. In an OPT-IN environment the patient is given the opportunity to ALLOW sharing of their information. This means that there is a presumption that the patient does not want their health information shared. I would view it more as respect for the patient to make the decision.

This model is used in many regions, even within the USA. With an HIE this model will work for many use-cases quite nicely. This contrasts with the HIPAA Privacy use of Implicit Consent, which is likely a better model within an organization. The two models are not in conflict; one could use Implicit Consent within an organization, and OPT-IN (Explicit Consent) within the HIE.

Privacy Consent Policy

The above models seem simple with the words "YES" and "NO", but this is not as clear as it seems. Indeed, the meaning of "YES" and the meaning of "NO" are the hardest things to figure out. It includes questions of "who" has access to "what" data for "which" purposes. It includes questions of break-glass, re-disclosure, and required government reporting. The "YES" and the "NO" are indicators of which set of rules apply.

The important thing is that there are different rules. The state of "YES" doesn't mean that no rules apply; there are usually very tight restrictions. The state of "NO" often doesn't truly mean no use at all. There is usually some required government reporting, such as for the purposes of protecting public health.

Privacy Consent: YES vs NO

The reality of privacy consent is that a number of patients will change their mind. This is just human nature, and there are many really good reasons they might change their mind. A patient that has given OPT-IN authorization might revoke that authorization. A patient that has indicated they don't want their data to be shared might decide that they now do want to share their data. For example, as a patient ages they may recognize that they can be best treated if all their doctors can see all the other doctors' information.

Thus, with what seems like a very simple state diagram for OPT-IN or OPT-OUT, one must recognize the need to support transitions between "YES" and "NO".

Privacy Consent of Maybe

Lastly, we all recognize that the world is not made up of 'normal' people. There are those that have special circumstances that really require special handling. This I am going to show as another state: "MAYBE". This state is an indicator, just like "YES" or "NO", but in this case it indicates that there are patient-specific rules. These patient-specific rules likely start with a "YES" or a "NO" and then apply additional rules. These additional rules might block a specific time-period, block a specific report, block a specific person from access, allow a specific person access, etc. These special rules are applied against each access.
Note that the state diagram shows transitions between all three states. It is possible that one goes into the "MAYBE" state forever, or just for a while.
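The three-state model above can be sketched as a tiny state machine in Python. The state names, rule descriptions, and class shape here are illustrative; the point is that the state is just an indicator of which rule set applies, and any state may transition to any other.

```python
STATES = {"YES", "NO", "MAYBE"}

# Each state selects a rule set; the rules themselves are the hard part.
RULES = {
    "YES":   "sharing rules apply (still with tight restrictions)",
    "NO":    "no sharing, except required government reporting",
    "MAYBE": "patient-specific rules, evaluated on each access",
}

class ConsentState:
    def __init__(self, default_state):
        # OPT-IN environments presume "NO" absent a patient decision;
        # OPT-OUT (Implicit Consent) environments presume "YES".
        self.state = default_state

    def record_decision(self, new_state):
        if new_state not in STATES:
            raise ValueError(f"unknown state: {new_state}")
        self.state = new_state  # any state can transition to any other

    def applicable_rules(self):
        return RULES[self.state]

consent = ConsentState(default_state="NO")  # an OPT-IN environment
consent.record_decision("YES")              # patient authorizes sharing
consent.record_decision("MAYBE")            # patient adds special handling
print(consent.applicable_rules())
```

Notice that the machine itself is trivial; all of the actual complexity lives in the rule set each state points to.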

Privacy Consent is a Simple State Diagram

I hope that I have shown that Privacy Consent is simply state transitions. I really hope that I have explained that each state has rules to be applied when a patient is in that state. Implicit (OPT-OUT) and Explicit (OPT-IN) are simply indicators of which state one starts in: which state is presumed in the absence of a patient-specific decision. The rules within each state are the hard part. The state diagram is simple.

Other Resources


Patient Privacy controls (aka Consent, Authorization, Data Segmentation)

Access Control (Consent enforcement)