Securing Personal Portable Storage

by Bradley Berg     May 12, 2005

1.  Introduction

A Portable Storage Device (PSD) such as a USB Flash Drive or Minidisk lets
people carry personal data in a compact lightweight form.  We would like to
ensure that personal data remains confidential.  To this end the data needs to
be managed so that sensitive content has limited exposure and all data can be
reliably accessed and routinely backed up.

Even computing professionals have a difficult time achieving this, and it is
easy to lose important data as a result.  Ease of use is essential if
non-professionals are to manage their data; otherwise they will have no option
but to live with insecure and unreliable data management practices.

This paper proposes mechanisms for authenticating access to confidential data
stored on a PSD.  Two tools for data encryption were implemented to evaluate
how the methods work in different environments and how easy they are to use.
The paper also describes additional tools for managing and backing up data
stored on a PSD.


2.  Data Security

While portable storage is convenient to carry, it is also easy to lose and
subject to theft.  People might choose to safeguard certain data, such as a
mailbox, that they would normally store unprotected in clear text form on a
trusted system.  When stored on a PSD such data can be saved in encrypted form
to prevent unwanted access.  It should be visible to the file system while
working on it, but encrypted while toting it around.

Computers are now online much of the time at both work and at home.  There is a
greater need to secure personal data exposed through internet connections.
Just using a PSD to store personal data instead of a local file system
continuously connected to the internet provides a modest level of security.
The PSD only needs to be plugged in while it is in use, which limits the
time of exposure.  Setting system permissions so that the device can't be
accessed over a LAN or WAN limits many forms of malicious access.  This is the
default setting for Windows systems; users need to explicitly share removable
drives.

People also store some highly sensitive information that would cause serious
problems if it were in the wrong hands.  This includes such things as
information about financial accounts, computer passwords and confidential
personal information.  The general advice given to people is to not store this
information on a computer at all, which is not a bad idea.  However that is not
always convenient and has its own management problems.  For instance people
can forget or lose passwords or leave them lying around on post-its.

People can store their most sensitive information on a PSD safely and
conveniently if some simple tools and data management practices are employed.
A primary copy can be maintained on the PSD that can be accessed only by
the user.


2.1  Security Provided By PSD Vendors

When you buy a PSD, vendors usually supply some data access mechanisms.  Data
access is granted through a password or with a thumbprint scanner.  While this
form of secure access is sufficient for many people, it has some limitations.

Thumbprint scanners provide a high degree of protection, but are overkill for
the level of security required by most people.  Thumbprint signatures are hard
to forge and can't be guessed like passwords.  They may also be easier to use
than a password as there is nothing to forget and they can be scanned faster
than a password can be typed.  As a security mechanism they are a functional
substitute for passwords.  Their cost is intrinsically higher, but may be worth
the cost to people who want to use vendor-supplied security.

These authentication mechanisms vary between vendors, but generally
authentication data is stored on the device itself and access is granted
through software provided by the vendor.  The authentication data is stored in
an area of memory not accessible through the file system.  At least some
vendors store protected data in encrypted form.  Vendors do not publish
their authentication mechanisms, so it is unclear how secure they are.
Depending on the implementation, the authentication mechanisms may be subject
to attack through reverse engineering.

Credentials management varies between operating systems.  Even amongst different
versions of Windows there is no common API for credentials management.
Consequently it is difficult for vendors to provide consistent authentication
mechanisms for even Windows platforms.  Instead vendors develop custom
authentication mechanisms so their devices work securely in different
environments.

Authentication mechanisms that are built into the device do not recognize that
the user has logged into a trusted system.  The user needs to enter a password
each time they want to access their confidential data.  Furthermore,
restrictions on the password (e.g. it must be numeric) might mean that a
familiar password can't be used and as such it can easily be forgotten.

Vendor-supplied authentication mechanisms have a different user interface
for each vendor.  When you switch between drives the user interface changes
and different features are supported.  New device drivers or software might
need to be installed in order to use the device.  This might not be possible
in some environments due to access restrictions or system and software
incompatibilities.

As a convenience many PSD vendors partition the drive into protected and
unprotected partitions [1].  Most personal data is not confidential so it can
be stored in the clear text region and accessed without going through an
authentication process.  The size of the encrypted region is predetermined by
the owner and may not be adjusted as space needs change.  Given that PSDs have
limited space, this restriction becomes a problem when using the devices for
long term storage.

Once the drive is unlocked, data on the drive is exposed to the file system.
If the device is your primary device the encrypted data is exposed much of the
time.  Encrypting data on the device hides data if the device is lost or
stolen, but provides no protection while the device is on-line.  Consequently,
it is not recommended that highly sensitive information be stored on encrypted
partitions.


2.2  Encrypting Data Using Utility Programs

This section describes utilities written for this project that encrypt files
and directories.  These utilities operate either in a trusted environment such
as a home or office, or in a foreign environment such as a friend's computer or
a publicly available computer in a library or internet cafe.


2.2.1  Authenticating Access To Encrypted Data

In a trusted environment the utilities are installed on the local computer and
a password is retained by the operating system and used to access encrypted
data.  Subsequent executions of the utilities retrieve this password without
forcing the user to re-enter it.  If the utility fails to automatically unlock
a particular file, then the file has been encrypted with a password other than
the saved password.  Only then is the user prompted for a password specific to
the file.  This way access can be restricted on a per-item basis; however, it
is most convenient to use the same password for all encrypted items.
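
The fallback from the saved password to a per-item prompt could be sketched
roughly as follows.  This is a minimal illustration and not the Ezip code: the
paper does not say how a wrong password is detected, so the salted PBKDF2
verifier and the names key_check and find_password are assumptions made here.

```python
import hashlib
import hmac
from typing import Callable, Optional


def key_check(password: str, salt: bytes) -> bytes:
    """Derive a verifier for a password (hypothetical scheme, not Ezip's)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def find_password(salt: bytes, verifier: bytes,
                  saved_password: Optional[str],
                  prompt: Callable[[], str]) -> str:
    """Return the password that unlocks an item.

    The saved password is tried silently first.  The user is prompted only
    when no password is saved or the saved one does not match this item.
    """
    if saved_password is not None:
        if hmac.compare_digest(key_check(saved_password, salt), verifier):
            return saved_password
    entered = prompt()
    if hmac.compare_digest(key_check(entered, salt), verifier):
        return entered
    raise PermissionError("password does not unlock this item")
```

In a foreign environment the same function would be called with
saved_password set to None, so every item leads to a prompt.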

Foreign environments are distinguished by the lack of installation elements
(e.g. Windows registry entries or stored credentials) resident in the
environment.  The utilities are not installed on foreign systems, but are
instead installed on the PSD.  This introduces the problem of authenticating
permission to run software installed on a PSD (see Section 4.1).  Since
passwords are not saved in a foreign environment the password must be
re-entered for each encrypted item.  Because we want to minimize the use of
confidential information in foreign environments this should not be much of a
problem.

The utilities were written using the Windows XP credentials manager for
authentication [2, 3].  The credentials manager provides the familiar Windows
password dialog box and securely saves user name and password combinations.  If
the user saves the password then the utility can retrieve it without prompting
the user.  Although the utilities need only a password and not a user name, the
interface forces users to enter a user name as well.  Future work on
authentication for the utilities would be to find a way to avoid this.

Additionally, credentials manager only works with Windows XP service pack 2
and Windows 2003.  Alternative implementations are required to develop
utilities that will persist passwords in other environments [4].  Some
operating systems might not provide a secure mechanism for saving passwords, or
the mechanism might be easily cracked.  For those systems users may be
required to re-enter their password for each access.  If this happens
frequently these utilities may not be appropriate.


2.2.2  File and Directory Encryption

A utility was written for this project called Ezip to encrypt and decrypt files
and directories.  It uses the International Data Encryption Algorithm with
128-bit secret keys [5].  If the user has saved their password in a trusted
environment the data can be encrypted and decrypted without prompting for a
password.  Data stored on a PSD is decrypted in-place to avoid retaining clear
text on a local file system where it can be retrieved by an untrusted party.
Even if the user had intended to delete it, they may be unable to due to a
system failure.

The decryption process consists of several stages.  Directories are unrolled
using the GNU tar utility, and data compression may also be employed to save
storage space, which can be limited on a PSD.  To avoid writing intermediate
data between stages to any device, data pipes need to be used to connect the
stages.  This functionality is not yet implemented, but is essential to make
the Ezip utility practical.  Intermediate files take up space on a PSD, but
more importantly could be left behind if saved to a local file system.
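
One way to chain the stages without intermediate files is to stream each
stage's output straight into the next.  The sketch below shows the encrypting
direction (tar, then compress, then encrypt) using Python's tarfile module in
stream mode.  It is only an illustration: XorStream, CipherWriter and
pack_directory are hypothetical names, and the cipher is a simple stand-in
because IDEA is not part of the Python standard library.

```python
import hashlib
import tarfile


class XorStream:
    """Stand-in stream cipher (SHA-256 in counter mode).  NOT IDEA, not vetted."""

    def __init__(self, key: bytes):
        self._key = key
        self._counter = 0
        self._buf = b""

    def apply(self, data: bytes) -> bytes:
        # Extend the keystream until it covers this chunk, then XOR.
        while len(self._buf) < len(data):
            block = self._key + self._counter.to_bytes(8, "big")
            self._buf += hashlib.sha256(block).digest()
            self._counter += 1
        stream, self._buf = self._buf[:len(data)], self._buf[len(data):]
        return bytes(a ^ b for a, b in zip(data, stream))


class CipherWriter:
    """File-like sink that encrypts each chunk and forwards it to `out`."""

    def __init__(self, out, key: bytes):
        self._out = out
        self._cipher = XorStream(key)

    def write(self, data: bytes) -> int:
        self._out.write(self._cipher.apply(bytes(data)))
        return len(data)


def pack_directory(path: str, out, key: bytes) -> None:
    """tar -> gzip -> encrypt, chained in memory with no intermediate file."""
    with tarfile.open(fileobj=CipherWriter(out, key), mode="w|gz") as tar:
        tar.add(path, arcname=".")
```

Here `out` would be a file opened on the PSD, so only encrypted bytes are ever
written to a device.  Decryption would run the same stages in reverse order.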

The suffix, ".ez", is appended to files encrypted with the Ezip utility.
In a trusted environment the suffix can be associated with the utility and
the password saved.  In the Windows implementation this lets users simply
click on an icon for an encrypted file or directory to decrypt it.

There is the possibility that a decrypted item will be left in clear text form
at the end of a session.  A tool could be written to track files that are
currently exposed and close them at the end of a session.  Before ending
sessions with a PSD, software must be run to flush data to the drive.  This
tool could run as part of the software used to close the session.  The list of
exposed items should be saved on the PSD so that the tool can find them later
if a session was terminated abnormally and the tool was not run.
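
Such a tracking tool has not been written; the following is only a sketch of
how it might keep its list on the PSD.  The class name ExposedList, the JSON
file format and the `encrypt` callback are all assumptions for illustration.

```python
import json
import os
from typing import Callable, List


class ExposedList:
    """Tracks clear-text items.  The list itself is stored on the PSD."""

    def __init__(self, list_path: str):
        self._path = list_path

    def _load(self) -> List[str]:
        if not os.path.exists(self._path):
            return []
        with open(self._path, "r", encoding="utf-8") as f:
            return json.load(f)

    def _store(self, items: List[str]) -> None:
        # Write to a temporary name and rename, so an abnormal termination
        # leaves either the old list or the new one, never a partial file.
        tmp = self._path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(items, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._path)

    def mark_exposed(self, item: str) -> None:
        items = self._load()
        if item not in items:
            items.append(item)
            self._store(items)

    def close_all(self, encrypt: Callable[[str], None]) -> List[str]:
        """Re-encrypt every listed item; items that fail stay on the list."""
        remaining = []
        for item in self._load():
            try:
                encrypt(item)
            except OSError:
                remaining.append(item)
        self._store(remaining)
        return remaining
```

Because the list survives on the PSD, calling close_all at the start of the
next session would catch items left exposed by an abnormal termination.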

In some cases the process of encrypting large files and directories may take
too long.  Also decryption exposes clear text to the file system (and
untrusted administrators), so this utility is not useful for highly sensitive
information that should never be saved to a file system as clear text.


2.2.3  Text Editor with Encryption

Ideally applications can encrypt and decrypt data without storing any clear
text on any file system.  In this project a plain text editor, Magenta, was
retrofitted with the Ezip utility.  Highly sensitive data can be stored in
encrypted form and accessed without ever being stored as clear text in the file
system.

Encrypted files with the ".ez" suffix are automatically decrypted when opened
with the editor.  Any files that were originally encrypted are re-encrypted
when saved by the editor.  Users don't need to remember to re-encrypt the clear
text as they did with the stand-alone Ezip utility.  Users need to explicitly
save a previously encrypted file in clear text form if that is what they want.

If the file was opened with a password other than the saved password the user
is prompted for the password again when saving the file.  This is required
because the password and its hash key were scrubbed when the file was opened.
This is also a good idea from the user's perspective.  It's clear to the user
which password is being used to save the file.
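
The open and save rules described above can be summarized in a few lines.
This is a sketch of the policy only, not of the Magenta source; EditSession,
open_session and password_for_save are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional

SUFFIX = ".ez"  # suffix the paper assigns to Ezip-encrypted files


@dataclass
class EditSession:
    path: str
    was_encrypted: bool        # file carried the ".ez" suffix when opened
    used_saved_password: bool  # unlocked with the saved password, no prompt


def open_session(path: str, unlocked_by_saved_password: bool) -> EditSession:
    encrypted = path.endswith(SUFFIX)
    return EditSession(path, encrypted, encrypted and unlocked_by_saved_password)


def password_for_save(session: EditSession,
                      saved_password: Optional[str],
                      prompt: Callable[[], str]) -> Optional[str]:
    """Return the password to re-encrypt with, or None to save clear text."""
    if not session.was_encrypted:
        return None
    if session.used_saved_password and saved_password is not None:
        return saved_password
    # The per-file password was scrubbed after opening, so ask again.
    return prompt()
```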

If the source file was also compressed, it needs to be run through a compression
filter when it is opened or closed.  As with the Ezip utility, the filter should
not save intermediate copies of the file on any drive.  Again pipes can be used
to avoid saving intermediate files.  This capability is not yet implemented.

The editor can decrypt directories and non-text files as well.  This way the
Magenta editor can be associated with the ".ez" suffix instead of the Ezip
utility.  The editor will open an encrypted text file in an edit session.
If instead the editor is given an encrypted binary file or directory the item
will be saved on the PSD in clear text form.


3.  Data Integrity

As personal data migrates, different versions of files are retained on the
various computers and PSDs people use, which can cause confusion.
Interruptions in service can occur as well.  Local disks or PSD drives can fail
or be accidentally corrupted.  A PSD can be forgotten, stolen, or lost.
Connections to LAN or WAN storage may be interrupted.  These are all problems
of data integrity.


3.1  Reliability of the Drive and Interface

USB flash drives are usually preformatted with a Dos Fat16 file system.  In an
experiment with a Fuji USB 2.0 flash drive under Windows 2000, directories
nested more than seven levels deep (a limitation of the Fat16 file system)
caused file system corruption.  Concurrent access caused a Windows blue screen
crash.  To avoid these problems USB flash drives should be reformatted with a
file system other than Fat16.

The Fat32 file system is recognized by all versions of Windows and also works
with Linux.  After formatting a Lexar JumpDrive for Fat32 the directory
depth problem went away.  Files were still corrupted under Windows 2000 when
accessed concurrently, but not under Windows XP.  Users need to verify that a
reformatted USB drive works properly on any system they intend to use with it.

Not all programs work with USB drives formatted with a Fat32 file system.  The
Microsoft Partner Pack [6] contains a utility, the "Microsoft USB Flash Drive
Manager".  It performs common operations needed to manage files on a Flash
Drive, such as data backup.  This utility did not recognize the JumpDrive
formatted with Fat32, but did recognize it when formatted with Ntfs.  Users
also need to check that USB drives work with the software they intend to use.

Collectively problems with USB drives make them difficult to use in an
arbitrary environment and make them untrustworthy.  If their use is limited to
tested platforms people should be able to reliably use a USB drive.  In
untested environments, avoid concurrent drive access.

The serial ATA drive interface is reliable and eventually may supersede the USB
interface.  The same ATA device drivers used for internal hard drives are
employed, which eliminates the concurrent access problems encountered with USB
drives.

The SATA I standard does not include power for external devices, but the
SATA II standard does with the eSATA [7] component.  The eSATA standard is
currently undergoing specification testing and is expected to be completed
in the summer of 2005.  The proposed eSATA connector is about 1" by 1/4".

The bandwidth of the SATA II interface is 300 MBps, while the peak USB 2.0
speed is 60 MBps.  In practice the actual bandwidth can be much lower,
depending on the device implementation and the computing platform.  The next
generation of non-volatile RAM [8] is expected to have access times comparable
to main memory.
This should allow the creation of PSDs with speeds exceeding that of hard
drives.
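
The two peak figures follow from the signalling rates of each interface.  The
short calculation below is a back-of-the-envelope check rather than a
measurement; the function names are made up for this example, and protocol
overhead beyond the line coding is ignored.

```python
def payload_mbytes_per_s(line_rate_mbit: float, line_bits_per_byte: int) -> float:
    """Convert a signalling rate in Mbit/s to payload megabytes per second."""
    return line_rate_mbit / line_bits_per_byte


# SATA II signals at 3000 Mbit/s with 8b/10b coding: 10 line bits per byte.
sata2 = payload_mbytes_per_s(3000, 10)
# USB 2.0 high speed signals at 480 Mbit/s, taken here as 8 bits per byte.
usb2 = payload_mbytes_per_s(480, 8)


def seconds_to_copy(megabytes: float, rate_mbytes_per_s: float) -> float:
    """Best-case time to move a given amount of data at a peak rate."""
    return megabytes / rate_mbytes_per_s


print(sata2, usb2)  # 300.0 60.0
```

At these peak rates a 1000 MB transfer takes about 3.3 seconds over SATA II
and about 16.7 seconds over USB 2.0.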


3.2  Web Distributed Storage

A backing storage source can provide access to data in the event that a
PSD is unavailable.  Several ongoing research efforts use WAN-based storage
systems that provide availability at multiple locations.  When coupled with a
PSD people can have consistent and reliable storage available nearly everywhere.

The PersonalRAID system [9] treats a PSD as a cache for multiple local drives.
Consistency is maintained using parallel file system index trees called a
log-structured file system.  Data is replicated over local storage at each
location where you work.  The file systems are synchronized by replaying
updates kept in file logs.  Checkpointing is used for disaster recovery.

PersonalRAID primarily addresses the problem of slow WAN access by using the
PSD as a cache.  Data is replicated for increased availability; acting as a
reliable backup service.  It does not treat the PSD as a principal storage
device.  The limited capacity of PSDs is not addressed in this work.

As with the PersonalRAID system the work by Tolia et al. [10] uses a PSD as a
cache for a file system distributed over a WAN.  They cite the reliability
problems and availability limitations of PSDs as their reason for restricting
them to caching.  Unlike PersonalRAID, which has a page size granularity, this
approach uses a file granularity.  When data is available on the PSD then the
whole file is available and not just a fragment.

Still the main intent is to improve the inherently slow access times of
distributed file systems.  Their solution is to use a combination of local
storage caches and the PSD to provide fast access.  In this case the PSD
provides availability for the most often used files when the distributed file
system is unavailable.  The user does not control which files reside on the
PSD, so some critical files may not be available through the PSD.

The Pastiche system [11] provides a distributed WAN backup system using
peer-to-peer network connections.  Data is stored redundantly over distributed
servers for high availability and reliability.  Pastiche is primarily an
outbound backup service.  The overhead as seen by the user is low.  Inbound
access is limited to data recovery and synchronization.

Pastiche stores data in blocks called chunks.  This is transparent to users as
the source file system (the PSD) is not affected by the granularity.  Hashing
is performed over chunks to detect matching or changed copies.  Data is
replicated on multiple peer-to-peer servers that cooperatively share unused
storage.  Data is encrypted for privacy in the distributed file system.  Users
avoid paying for costly remote backup services, but must provide a server and
storage space to join the network.
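
Hashing over chunks to find changed data can be illustrated as follows.  This
is a simplified sketch and not Pastiche's implementation: fixed-size chunks,
the SHA-256 hash and the function names are assumptions made for brevity.

```python
import hashlib
from typing import List

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for this example


def chunk_hashes(data: bytes, size: int = CHUNK_SIZE) -> List[str]:
    """Hash each consecutive chunk of `data`."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def changed_chunks(old: List[str], new: List[str]) -> List[int]:
    """Indices in `new` whose hash differs from, or has no counterpart in, `old`."""
    return [i for i, h in enumerate(new) if i >= len(old) or old[i] != h]
```

Only the chunks reported by changed_chunks would need to be sent to the backup
peers; chunks whose hashes match are already stored.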

The Pastiche model could be further extended to propagate changes between
the WAN and a local file system.  Changes made on a PSD (say at home) would
propagate through the WAN to local storage at another site (say at work).
This would keep the local copies at both ends synchronized.  The amount
of synchronization required when attaching a PSD would be reduced by
the synchronization taking place through the WAN channel.

Multiple versions of files are kept to compensate for user errors, such as
unintended deletions.  This could be extended for PSDs to resolve collisions
when synchronizing file systems at different locations.  Collisions occur when
the same file is updated separately on a disconnected PSD and on a computer
connected to the distributed file system.  Such inconsistencies need to be
detected so the user can merge the changes.
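
Collision detection of this kind can be done by comparing each copy against
the version recorded at the last synchronization.  The sketch below assumes a
content hash is kept for that common base version; the names SyncAction,
digest and classify are hypothetical.

```python
import hashlib
from enum import Enum


class SyncAction(Enum):
    NONE = "in sync"
    COPY_TO_REMOTE = "copy PSD version to the distributed file system"
    COPY_TO_PSD = "copy distributed version to the PSD"
    COLLISION = "both changed; user must merge"


def digest(data: bytes) -> str:
    """Content hash used to compare versions of a file."""
    return hashlib.sha256(data).hexdigest()


def classify(base: str, psd: str, remote: str) -> SyncAction:
    """Compare both copies against the hash saved at the last sync."""
    if psd == remote:
        return SyncAction.NONE
    if psd == base:
        return SyncAction.COPY_TO_PSD      # only the remote copy changed
    if remote == base:
        return SyncAction.COPY_TO_REMOTE   # only the PSD copy changed
    return SyncAction.COLLISION            # updated separately on both sides
```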

An ephemeral backup system saves changes to files automatically when they are
written.  This requires that the operating system provide support to notify the
ephemeral backup system whenever a changed file is closed.  Secure access to
the changed files must also be granted.  If the backup system is distributed
over a WAN then a coherent and up to date version of files can be accessed from
multiple locations.


4.  Authenticating Software Packages

Software installed on a PSD needs to work on multiple computers.  This is
contrary to the distribution model where software is deployed on specific
computers.  This complicates the installation process for software vendors.
Users might be unable to install specific software packages on a PSD or have a
difficult time getting the software to work in multiple environments.



4.1  Independent Software Vendors

When installing software a product key is often required to authenticate your
right to use the software.  Given that any competent programmer can modify a
program to bypass such checks, this is merely a deterrent.  The ultimate
protection mechanism is the legal copyright.

Software vendors use a variety of obfuscation methods that are difficult to
reverse engineer [12].  From a security perspective this is a weak method of
authentication.  Often vendors will hide access information in rarely visited
portions of the file system or in the Windows registry.  This presents a
problem when installing software onto a PSD.  There are very few places to hide
such information.  Some form of standardized hardware and operating system
support for authentication may be required to make vendors more comfortable
with the idea of deploying to a PSD.

Legacy software installation programs may preclude installing software on a PSD.
Furthermore software vendors may not want their software installed on a PSD.
The traditional licensing model binds installed software to computers and not
just the disk drive.  As PSDs become more commonplace, perhaps binding
software licenses to people and not computers makes more sense.

Legacy software can be installed in trusted environments and it can access data
on a PSD just like any removable drive.  However users are often prohibited
from installing legacy software in foreign environments.  Either the user
doesn't have appropriate privileges or is not allowed to pollute the
environment with outside software.  Even if the software was uninstalled after
the session, many software packages have poorly written uninstallers that leave
behind remnants.

To operate in a foreign environment, software needs to be installed on the PSD
itself.  Software for Windows is typically installed on a local file system and
settings are made in the registry.  Software installed on a PSD needs to work
on multiple platforms, each with its own registry.  Vendors need to make
explicit changes to their installation programs before software can be deployed
on PSDs.


4.2  Installing an Operating System on a PSD

Operating systems can be boot loaded from a PSD.  When running the OS from a
PSD, legacy software can also be installed on the PSD just like any drive.
Additionally the system will be configured with the user's preferences
and security credentials wherever it is used.

Difficulties arise when booting an operating system in a foreign environment.
Operating systems are installed for a particular hardware configuration, not
multiple configurations.  It's unlikely that any arbitrary computer will be
able to boot from your PSD.  With some effort a person should be able to set up
an operating system on a PSD that will boot load in both a home and work
environment.  To boot load an operating system [13, 14]:

  *  The BIOS running on the mainboard in the host system must support
     booting from the PSD's interface (e.g. USB).

  *  The PSD must support boot loading.

  *  The operating system must permit installation on a PSD and be configured
     for hardware on the target host computers.


5.  Conclusions

Using a PSD as a primary storage medium currently requires careful data
management and system administration.  Hardware and operating system
dependencies prohibit or limit usage to a few controlled environments.

With effort a computer savvy person could use a PSD for many primary data
storage requirements.  Usage must be verified for each environment and data
must be carefully secured.  The tools developed for this project can provide
data security with convenient authorization in trusted environments.  In
particular, the text editor provides secure access to highly sensitive data
even in untrusted environments.

In the near term emerging technologies show great promise for making PSDs more
reliable and more ubiquitous.  Drives using the eSATA interface will not have
the data integrity problems of the USB interface.  New non-volatile memories
will have unlimited rewrites and will be faster and cheaper than current flash
memories.  Still it will take several years for these technologies to gain
widespread use.  In the meantime early adopters will be able to deploy them at
home and in the office.

Even though the technology is in place, systems integration issues will
continue to be the limiting factor.  Legacy applications and operating systems
will always be problematic.  People can work around this if they have the
ability to cleanly boot load an operating system from arbitrary computers.
It's unclear whether anyone will be integrating secure ephemeral web-based
backup with any operating system anytime soon.  Operating system and PSD
vendors will need to implement standards for installing software on a PSD.

In a lecture, Bill Poduska observed that the computing industry advances in
steps.  Every decade or so a confluence of three emerging technologies yields
a leap in computing capability.  This may be happening with the convergence of
a new generation of non-volatile memory, miniature hard drives and the external
SATA II interface.  Low cost and highly reliable PSDs will have capacities on
the order of 100GB and speeds far exceeding that of current hard drives.  As
data shifts from local file systems to PSDs we may say that the pen drive is
the computer.


BIBLIOGRAPHY


[1] Lexar USB Flash Drives - JumpDrive Secure
      http://www.lexar.com/jumpdrive/jd_secure.html

[2] Michael Howard and David LeBlanc. Writing Secure Code.
      Microsoft Press November 2001

[3] Credential Manager OS Design Development
      http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wcesecurity5/html/wce50conCredentialManagerOSDesignDevelopment.asp

[4] Gary C. Kessler. Security in Windows NT. November 1997
      An edited version of this paper appeared with the title
      "How to Improve Windows NT Security" in Network VAR February, 1998.
      http://www.garykessler.net/library/ntsecurity.html

[5] Bruce Schneier. The IDEA Encryption Algorithm.
      Dr. Dobb's Journal; December 1993; volume 11, number 12.

[6] Microsoft Partner Pack for Windows XP
      http://www.microsoft.com/windows/partnerpack

[7] Silicon Image Whitepaper. External Serial ATA. September 2004
      http://www.sata-io.org/docs/External%20SATA%20WP%2011-09.pdf

[8] Bradley A. Berg. New Computers Based on Non-Volatile Random Access Memory
      July 18, 2003
      http://www.techneon.com/paper/nvram.html

[9] Sobti, S.; Garg, N.; Zhang, C.; Yu, X.; Krishnamurthy, A.; Wang, R.
      PersonalRAID - Mobile Storage for Distributed and Disconnected Computers.
      Proceedings of the First USENIX Conference on File and Storage
      Technologies (Monterey, CA, Jan. 2002)
      http://www.usenix.org/publications/library/proceedings/fast02/full_papers/sobti/sobti_html

[10] N. Tolia, J. Harkes, M. Kozuch, M. Satyanarayanan.
      Integrating Portable and Distributed Storage.
      Proceedings of the 3rd USENIX Conference on File and Storage Technologies.
      San Francisco, March 2004.
      http://www.pdl.cmu.edu/PDL-FTP/DC/integratingpds-fast04.pdf

[11] Landon P. Cox, Christopher D. Murray, and Brian D. Noble.
      Pastiche: Making Backup Cheap and Easy. USENIX Association,
      5th Symposium on Operating Systems Design and Implementation
      http://www.eecs.umich.edu/~lpcox/osdi02.pdf

[12] Serge A. Sereda. Software Protection Systems Efficiency Estimation.
      Academy of Economic Studies of Moldova,
      Department of Cybernetics and Economic Computer Science
      http://www.ase.md/~osa/publ/en/puben09.html

[13] USB Flash Drive Alliance. USB Flash Drive Overview.
      http://www.usbflashdrive.org/usbfd_faq.html#bootable

[14] Recommendations for Booting Windows from USB Storage Devices.
      http://www.microsoft.com/whdc/device/storage/usb-boot.mspx