InternetCafe Project - Requirements - Subproject 6

Java Distributed Backup System - Index of contents

Section	Title
0.0	Vision
1.0	Requirements
1.1	Independence from the platform
1.2	Sharing disk's space availability between users
1.3	Secure data access and storage
1.4	Smart data upload and retrieval
1.5	An uncoupled and easy extendible component
2.0	Requirements Analysis
2.1	Functional requirements' analysis
2.2	Not functional requirements' analysis
2.3	Glossary
2.4	JDBS Offline context
2.4.1	Encrypt backup content with a symmetric cryptographic approach
2.4.2	Signing backup artifacts with an asymmetric cryptographic approach
2.4.3	File compression utility
2.4.4	Global unique identifier generator
2.5	JDBS Online context
2.5.1	Peers' discovery protocol
2.5.2	Backup storage on peers protocol
2.5.2.1	Free space request phase
2.5.2.2	Free space response phase
2.5.2.3	Upload phase
2.5.3	Backup retrieval protocol
2.5.3.1	Lookup request phase
2.5.3.2	Lookup response phase
2.5.3.3	Ownership check request phase
2.5.3.4	Ownership check response phase
2.5.3.5	Download phase
2.6	Use cases and scenarios
2.6.1	Use Case - Backup artifact creation
2.6.2	Use Case - Distributed backup storage
2.6.3	Use Case - Distributed backup retrieval
3.0	Analysis
3.1	Problem Analysis
3.2	Structural Analysis
3.3	Behavioral Analysis
3.4	Interaction Analysis
3.5	Risks Analysis
4.0	Logic Architecture
5.0	Project
5.1	Projects choices
5.2	Patterns
5.3	System structure
5.4	System behaviours
5.5	System interactions
6.0	Architecture
7.0	Conclusions

0.0 Vision

Since disk drives are cheap, backup should be cheap too. Of course it does not help to mirror your data by adding more disks to your own computer because a virus, fire, flood, power surge, robbery, etc. could still wipe out your local data center. Instead, you should give your files to peers (and in return store their files) so that if a catastrophe strikes your area, you can recover data from surviving peers. The Java Distributed Backup System (JDBS) is designed to implement this vision.

1.0 Requirements

These are the requirements for a system that stores data on peers and retrieves it, managing a distributed set of backups. The main goals of this project are:

Independence from the platform.
Sharing disk's space availability between users.
Secure data access and storage.
Smart data upload and retrieval.
An uncoupled and easy extendible component.

Note that this is not a p2p file sharing system: it provides a way to store data on multiple peers and to retrieve it, given prior knowledge of what we want to get back. The system will automatically retrieve a list of available peers (those that are sharing disk space) by searching a specific neighborhood. Each peer will hold a descriptive set of the data it stores and some knowledge about the data's owner. Each point mentioned in this section is described separately later in this document. A possible network scenario is presented in the following figure:

JDBS - Network Scenario

The figure shows that JDBS can work inside a LAN (Local Area Network) or with peers located anywhere in the world, as long as they are reachable through the Internet. This context can be described as follows:

Any user who has to back up his data and store it safely must use JDBS to compress that data and choose, from a list of available peers, those to which he wants to upload it. The user must give JDBS a key from his key ring with which the system can encrypt the user's backup artifact.
Any user who has to retrieve a backup from the network of available peers must provide a search key to the system and then retrieve the backup from the neighborhood, probably from the nearest peer or the peer with the highest bandwidth.

A short but detailed description of each requirement mentioned in this section follows.

1.1 Independence from the platform

In order to give the community a well-made distributed backup tool, JDBS must be available on every platform. To be smart and avoid code replication and/or unrelated platform-dependent solutions, we must think of JDBS as written in an interpreted language and delegate the multi-platform requirement to that language. Thinking of the system in these terms leaves, in practice, no choice other than a Java solution.

1.2 Sharing disk's space availability between users

The simple idea at the basis of JDBS is: if a user wants to store his backups on other peers, he must (having the chance) store someone else's backups on his own peer machine. This introduces some constraints on the use of the system: in particular, it introduces the idea of a "quota" of available disk space associated with each user. The more a user wants to store, the more space he must share with others. Another idea introduced by this requirement is that each backup has its own life cycle: some data has to be retrievable for long periods (for example, data subject to legal constraints that must be kept for years), while personal data may have a shorter life cycle. The system must provide a way to declare data obsolete after such a period, so that the peers involved can release the occupied space and declare it available again.

1.3 Secure data access and storage

Because we are not in a simple p2p context of file sharing and open data access, the system must provide a way to protect the user's data. No one except the data's owner may read, navigate or modify the stored data; peers are seen only as data repositories. To achieve this, JDBS has to protect data with a combined symmetric and asymmetric key approach. This approach should obviously use open algorithms to meet the security requirement; the basic idea is to encrypt each backup's content (file) with a symmetric key and then sign the backup artifact with an asymmetric algorithm. There are many symmetric and asymmetric algorithms that could be used. The data security problems that JDBS must address under this requirement are the typical ones: Availability, Integrity, Authenticity and Confidentiality.

1.4 Smart data upload and retrieval

JDBS should make light use of network resources and provide a well-designed basic information retrieval architecture (needed to search for data stored on peers). The system should also integrate a subsystem to plan the movement of data between peer nodes; for example, data could be moved during the night hours, taking the peers' geographic locations into account. Because upload and retrieval are traffic intensive, it is assumed that each peer has a medium or high bandwidth Internet connection; this problem does not seem to affect LAN users, for whom assuming high bandwidth is not such a hard restriction.

1.5 An uncoupled and easy extendible component

Because JDBS will soon be integrated into the InternetCafe project, it should provide a well-defined set of interfaces and a well-designed set of configurable parameters, so that the component is easy to use from third party applications and offers users and developers a flexible way to extend it.

2.0 Requirements Analysis

This section describes the analysis of the problem as it is explained in the "Requirements" section. We will also introduce some considerations about functional and non-functional requirements, define an exhaustive set of scenarios and JDBS use cases, and draw up a glossary of the terms that recur most often in the domain language.

2.1 Functional requirements' analysis

Because the functional requirements were clearly explained in the previous section, here we only refer to the sections later in this document named "JDBS Offline context" and "JDBS Online context". Those sections explain and analyze the static context associated with the offline use of JDBS and the dynamic online context. The functional requirements for JDBS are exactly the goals described in the previous requirements section, excluding the non-functional requirements, which are explained in the section that follows this one. One functional goal that must be described here is "Sharing disk's space availability between users". Because users will be able to upload their backup artifacts to other peers, JDBS must guarantee that each user shares a proportionate amount of disk space with the other users in the community. A good default amount of shared space should be expressed as a percentage of the disk space available on the peer at the moment JDBS is installed. The user will of course be able to share space according to his machine's configuration, also basing the amount of shared disk space on the size of the backups he creates. In any case, JDBS rests on a trustworthy network of participants, so it is assumed that if a user does not share space with the community, this is because he really does not have such space to share.

2.2 Not functional requirements' analysis

Looking at the previous section, some non-functional requirements can be identified that JDBS must respect in order to cover every requirement successfully. According to what has been explained previously, the key non-functional requirements are independence from the platform, which leads to a clean use of Java as paradigm and programming language, and the design of JDBS as a component to be integrated into a bigger system or software application. Because the context of application of JDBS is the Internet community, and because each user has his own skills and language, a further requirement should be considered: multi-language support.

2.3 Glossary

In this section we present a short description of each term found in the previous requirements section. This is done to avoid misunderstanding a term when it is used in the following sections.

TERM	MEANING	SYNONYMS
Algorithm	A step-by-step problem-solving procedure, especially an established, recursive computational procedure for solving a problem in a finite number of steps. In this context the term algorithm stands for a way to encrypt or sign user's backup artifacts.	Symmetric cipher, Asymmetric cipher.
Information retrieval architecture	Stands for a way to index the user's backups in order to smartly retrieve those backups from other peers.
Authenticity	The quality or condition of being authentic, trustworthy, or genuine. In this context refers to the authenticity of a backup stored on a peer. A property of data managed or treated by JDBS.	Ownership.
Availability	Stands for the quality of being at hand when needed or the degree to which a system suffers degradation or interruption in its service to the customer as a consequence of failures of one or more of its parts. A property of data managed or treated by JDBS.	Life cycle.
Backup	The term commonly refers to a copy of the files on a computer's disks, made periodically and kept on magnetic tape or other removable medium (also called a "dump"). Most new computer users neglect this essential precaution until the first time they experience a disk crash or accidentally delete the only copy of the file they have been working on for the last six months. Ideally the backup copies should be kept at a different site or in a fire safe since, though your hardware may be insured against fire, the data on it is almost certainly neither insured nor easily replaced.	User data, Artifact, File.
Bandwidth	The data transfer capacity of a network. It is measured in bits per second.
Community	Stands for the group of people that will use, contribute or take part in some way to the JDBS project.
Component	Stands for a description of JDBS as a software artifact that is one of the individual parts of which a composite entity is made up, especially a part that can be separated from or attached to a third party system.
Confidentiality	The ethical principle or legal right that makes data private or inaccessible to others unless the owner consents to or permits disclosure. The state of being secret, or discretion in keeping information secret. A property of data managed or treated by JDBS.
Configuration Parameter	A parameter that is made configurable in order to make the use of JDBS by another system or third party application smarter.	Interface.
Connection	A means or channel of communication between computers or peers placed in the same network. The state of being connected to the Internet or to another machine in a LAN.	Internet connection, Network's resource.
Developer	Someone who develops or programs something that uses or concerns JDBS, extending it in parts or adding new features to a third party application.	Programmer.
Disk	Computer hardware that holds and spins a magnetic or optical disk and reads and writes information on it.	Hard drive.
Disk space	The amount of space on a peer's hard disk.	Quota.
File	A set of related records kept together. It is the basic unit of a composite user's backup.	Data file, Backup, Configuration file.
Geographic Location	Refers to the placement of peers in the JDBS network. A peer is in the same geographic location as another if it can easily reach that peer because both are placed in the same geographic area.	Network.
Integrity	The state of being unimpaired; soundness or wholeness. A property of data managed or treated by JDBS.
Interface	A boundary across which two systems communicate. An interface might be a hardware connector used to link to other devices, or it might be a convention used to allow communication between two software systems. Often there is some intermediate component between the two systems that connects their interfaces together.	API.
Internet	An interconnected system of networks that connects computers around the world via the TCP/IP protocol.
Java	A trademark used for a programming language designed to develop applications, especially ones for the Internet that can operate on different platforms.	Interpreted language.
Key	A unique string that represents an object and distinguishes it from another.	Search key.
Key ring	Commonly a file where a set of keys is held or stored.	Public key, Private key, Symmetric key.
LAN	Local Area Network. A system that links together electronic office equipment, such as computers and word processors, and forms a network within an office or building.
Law	A rule of conduct or procedure established by custom, agreement, or authority.
Life cycle	A progression through a series of differing stages of data availability. It concerns the period during which data or backups are stored on JDBS peers.	Data life cycle, Backup life cycle.
Neighborhood	A surrounding or nearby region. It commonly refers to the peers that reside in a nearby geographic location.	Peers, Network, Geographic location.
Network	Means a set of computers connected to each other through a physical link or connection layer.	Internet, LAN.
Peer	A unit of communications hardware or software that is on the same protocol layer of a network as another. A common way of viewing a communications link is as two protocol stacks, which are actually connected only at the very lowest (physical) layer, but can be regarded as being connected at each higher layer by virtue of the services provided by the lower layers. Peer-to-peer communication refers to these real or virtual connections between corresponding systems in each layer.	Machine, Computer, p2p, Node.
Platform	The basic technology of a computer system's hardware and software that defines how a computer is operated and determines what other kinds of software can be used.	Multi-platform, JAVA, Platform independent.
p2p	Peer to peer allows internet users to transfer files directly, rather than through the use of a website or directory. File transfers are done directly from the users' computers.	Peer to peer.
Quota	A proportional amount of space shared with or assigned to a group or to each member of the JDBS network. The quota of disk space of each peer or user is strictly related to the space occupied by that user on other participants of the JDBS network.
Repository	A facility where things can be deposited for storage or safekeeping.	Disk.
RSA	A cryptography algorithm based on public-private key. In this context refers to the safety and other security characteristics of JDBS managed data.	Asymmetric algorithm.
System	A group of interacting, interrelated, or interdependent elements forming a complex whole.	Application, Component, JDBS, Tool.
Scenario	A context of JDBS usage or a set of actions that should be performed in a particular context.	Context.
Subsystem	Refers to a particular subsystem contained in JDBS that plans the movement of data between network nodes.
Third party application	Stands for an external application that can use JDBS as system to store its data backups. JDBS should be used as a software component in other applications.	InternetCafe, Third party application, Third party system.
Usability	The effectiveness, efficiency, and satisfaction with which users can achieve tasks in a particular environment of a product. High usability means a system is: easy to learn and remember; efficient, visually pleasing and fun to use; and quick to recover from errors. A property of JDBS.
User	The user that will use JDBS. Someone doing "real work" with the computer, using it as a means rather than an end. Someone who pays to use a computer. A programmer who will believe anything you tell him. One who asks silly questions without thinking for two seconds or looking in the documentation. Someone who uses a program, however skillfully, without getting into the internals of the program. One who reports bugs instead of just fixing them.	Participant, Data owner, Backup's owner, JDBS user, Peer owner, Node owner.

2.4 JDBS Offline context

This section highlights the offline features that JDBS should provide. These features correspond to functions described in the JDBS requirements section and are discussed here in order to analyze the problem from a more functional and analytic point of view. The functionalities that JDBS must offer as an offline application are:

A utility to perform file encryption based on a symmetric algorithm.
A utility to sign backup artifacts with a public-private key algorithm.
A compression utility to create the final backup's file artifact.
A utility to build a global unique identifier to associate to each backup artifact.

The figure below shows a block diagram of how those components will be connected in order to generate a valid backup artifact:

JDBS - Data to signed backup artifact

A descriptive but not exhaustive analysis of those functionalities is given in the corresponding paragraphs at the end of this section.

2.4.1 Encrypt backup content with a symmetric cryptographic approach

The main role of this symmetric encryption utility is to perform a secure, password-based encryption of each file in a backup artifact. To do so, the utility must take as input a user's password and one or more files, and output an encrypted or decrypted version depending on whether the input file is in plain or encrypted form. A block view of this utility is presented below:

JDBS - Symmetric cryptographic approach

The first block in the figure represents the case in which a file in plain form is passed as input: once the user has provided his password, the file is encrypted with it and the block outputs it in ciphered form. The encrypted file will no longer be accessible to anyone but its owner. The second block represents the opposite case, in which the input file is encrypted and the output of the block (if the password provided by the user is valid) is the "plain" form of the file.
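As a rough illustration of the two blocks, the following sketch encrypts and decrypts a file's bytes from a password. It assumes AES in GCM mode with a key derived from the password through PBKDF2; the class name `PasswordCipher`, the salt and nonce sizes, the iteration count and the output layout (salt, then nonce, then ciphertext) are all assumptions made for the example, since the requirements do not fix a symmetric algorithm.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of JDBS: password-based encryption of file content.
final class PasswordCipher {
    private static final int SALT_LEN = 16;        // random salt for key derivation (assumed size)
    private static final int IV_LEN = 12;          // nonce size commonly used with GCM
    private static final int ITERATIONS = 65_536;  // assumed PBKDF2 work factor

    private PasswordCipher() {}

    // Derives a 256-bit AES key from the password and the salt.
    private static SecretKeySpec deriveKey(char[] password, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
        byte[] keyBytes = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(keyBytes, "AES");
    }

    // First block: plain bytes in, salt + nonce + ciphertext out.
    static byte[] encrypt(char[] password, byte[] plain) {
        try {
            SecureRandom random = new SecureRandom();
            byte[] salt = new byte[SALT_LEN];
            byte[] iv = new byte[IV_LEN];
            random.nextBytes(salt);
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt),
                    new GCMParameterSpec(128, iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = new byte[SALT_LEN + IV_LEN + body.length];
            System.arraycopy(salt, 0, out, 0, SALT_LEN);
            System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
            System.arraycopy(body, 0, out, SALT_LEN + IV_LEN, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    // Second block: returns the plain bytes, or throws IllegalArgumentException
    // when the password is wrong or the data has been altered.
    static byte[] decrypt(char[] password, byte[] encrypted) {
        try {
            byte[] salt = Arrays.copyOfRange(encrypted, 0, SALT_LEN);
            byte[] iv = Arrays.copyOfRange(encrypted, SALT_LEN, SALT_LEN + IV_LEN);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt),
                    new GCMParameterSpec(128, iv));
            int offset = SALT_LEN + IV_LEN;
            return cipher.doFinal(encrypted, offset, encrypted.length - offset);
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("wrong password or corrupted data", e);
        }
    }
}
```

An authenticated mode is used here so that an invalid password is detected at decryption time, which matches the "if the password is valid" condition of the second block.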

2.4.2 Signing backup artifacts with an asymmetric cryptographic approach

Because JDBS will be implemented in Java to satisfy the non-functional requirement described in a previous section, we must check whether something is already available on that platform to perform this signing. After some searching on the Internet we found a valid, usable, ready-made implementation of the RSA algorithm in the JCE (Java Cryptography Extension), and we think it will be suitable for the purpose. RSA should be easy to use in our case, but a few problems remain undefined in our domain. Because an RSA public/private key pair can be generated on demand by JDBS, we must define some rules or best practices for doing so. The problems found in this first analysis are:

Once a key pair has been generated, where must the private key be stored so that it remains totally secure?
Where must the RSA public key be stored so that it remains easily accessible to JDBS?
How many bits (1024, 2048) should be used for key pair generation? What is certain is that this problem also involves the backup's specific life cycle.
Given the most suitable configuration, how long does the signing of backup artifacts take?

What is certain at this point is that JDBS must provide a way to sign backup artifacts after the symmetric encryption of the backup data (the contained files), and that there will be a component whose role is to do that. A block view of this asymmetric cryptographic utility is presented in the figure below:

JDBS - Asymmetric cryptographic approach

The first block shows how the backup artifact is signed, and the second block shows how the signature is checked against a provided public key. The problems listed above will be addressed when the project phase starts; for now they need not be considered.
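The two blocks can be sketched with the standard `java.security` API as follows. This is only an illustration under assumptions: the class name `ArtifactSigner`, the `SHA256withRSA` signature scheme and the key size passed by the caller are choices made for the example, and the open questions about key storage are deliberately left out.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;

// Hypothetical helper, not part of JDBS: signs and verifies backup artifacts with RSA.
final class ArtifactSigner {
    private static final String SCHEME = "SHA256withRSA"; // assumed signature scheme

    private ArtifactSigner() {}

    // Generates an RSA key pair on demand; the bit length is the caller's choice.
    static KeyPair generateKeyPair(int bits) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(bits);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    // First block: produces the signature of the (already encrypted) artifact bytes.
    static byte[] sign(PrivateKey key, byte[] artifact) {
        try {
            Signature signer = Signature.getInstance(SCHEME);
            signer.initSign(key);
            signer.update(artifact);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    // Second block: checks the signature against the provided public key.
    static boolean verify(PublicKey key, byte[] artifact, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance(SCHEME);
            verifier.initVerify(key);
            verifier.update(artifact);
            return verifier.verify(signature);
        } catch (SignatureException e) {
            return false; // malformed signature bytes count as a failed check
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("verification failed", e);
        }
    }
}
```

Timing `generateKeyPair` and `sign` for 1024 and 2048 bits on a target machine would be one simple way to approach the key size and signing time questions listed above.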

2.4.3 File compression utility

It is obvious that the chosen platform will allow user data to be compressed by treating it as files, but it is not obvious which compression algorithm must be used in JDBS: should it remain the user's choice? Some well-known candidates are Zip, Gzip, tar, etc. A decision on how JDBS will work with such algorithms will be taken as soon as the project phase starts. A block view of this compression utility is presented in the figure below:

JDBS - File compression

The block in the figure shows that the component takes as input a compression algorithm and some files or folders that must be compressed with that algorithm.
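A minimal sketch of such a block, assuming Zip is the selected algorithm and using the standard `java.util.zip` package. To keep the example self-contained, files are modeled as an in-memory map from entry name to content; the class name `BackupCompressor` and this representation are assumptions, and a real component would read from disk and accept other algorithms.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of JDBS: packs named files into one Zip archive.
final class BackupCompressor {
    private BackupCompressor() {}

    // Compresses every (entry name, content) pair into a single archive.
    static byte[] zip(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey())); // folders appear as "dir/name"
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores the original entries from an archive produced by zip().
    static Map<String, byte[]> unzip(byte[] archive) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            byte[] chunk = new byte[4096];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    content.write(chunk, 0, read);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }
}
```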

2.4.4 Global unique identifier generator

Because every user will have his own set of local and distributed backups, JDBS must implement a sub-component that acts as a so-called GUID generator. The problem arises because backup names can easily overlap. Think of the scenario in which two users upload two backups with the same name to another peer: this is certainly dangerous for both users, because it makes one or both of those backup artifacts inconsistent. The problem does not occur during the upload process; it becomes evident during backup retrieval. A good naming strategy for users' backup artifacts should make each backup unique in the whole JDBS network and not only inside a peer's neighborhood. A probably good way to obtain a GUID in JDBS is to let users choose their own names for backup artifacts and delegate the creation of unique identifiers directly to JDBS. A good pattern would be to apply a hash function (even though it does not prevent overlaps) to the backup name chosen by the user in order to assign a name to the backup file. It is also possible, because each user has a unique nickname, to create a folder named after the user's nickname on every peer found in the neighborhood. A block view of this utility is shown in the figure below:

JDBS - Global Unique Identifier Generator
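The naming pattern described in this section (a hash of the chosen backup name, stored under a folder named after the user's nickname) can be sketched as follows. SHA-256, the `/` separator and the names `BackupIdGenerator`, `hashName` and `storagePath` are assumptions made for the example; the requirements only ask for a hash function and a per-user folder.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of JDBS: builds network-wide backup identifiers.
final class BackupIdGenerator {
    private BackupIdGenerator() {}

    // Hashes the user-chosen backup name into a fixed-length hexadecimal file name.
    static String hashName(String backupName) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(backupName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Combines the unique nickname (folder) with the hashed backup name (file).
    static String storagePath(String nickname, String backupName) {
        return nickname + "/" + hashName(backupName);
    }
}
```

With this layout, two users who both name a backup `photos` end up with different paths on the same peer, because the nickname folder differs even though the hashed file name is the same.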

2.5 JDBS Online context

In order to fix ideas on the domain, this section gives a detailed description of each phase involved in the distributed storage process (online upload or local storage, and online download), of how the network architecture that permits peer discovery is managed, and of the data retrieval model used to query peers. A scenario that seems applicable to JDBS is an ad hoc mobile network where peers collaborate to achieve the distributed storage of the user's backup and where groups of users (peers) can be dynamically created and managed. In this section we focus on the idea of node discovery and on how a supporting framework can cover that application area by providing node descriptions and reliable localization.

2.5.1 Peers' discovery protocol

Because JDBS is a totally distributed network of peers, it will probably be useful to find a pre-existing environment or middleware rather than define a new discovery protocol. Because in a p2p context the problems to face are always the same (discovery, peer searches, group management, etc.), it is better to use a well-made and already tested system than to design a new one from scratch. The peers' discovery protocol will probably refer to the JXTA one, but for now we must concentrate on other, more relevant things, deferring this problem or choice to the JDBS project section.

2.5.2 Backup storage on peers protocol

The following subparagraphs and figures show a network context with some discovered peer machines during an online backup upload process. The protocol for storing backups on peer machines requires an already valid JDBS backup file. To meet this prerequisite, the user has previously built through JDBS a backup artifact whose file content is encrypted and which is signed with the user's RSA private key. This process consists of 3 phases:

The free space request phase;
The free space response phase;
The upload phase.

A detailed description of each phase is given in the paragraphs below.

2.5.2.1 Free space request phase

In the "Free space request phase" of the protocol, the user's peer node asks the other peers for available disk space in which to store the user's backup artifact. This is shown in the figure below:

JDBS - Free space request

Peer Node 1 is the JDBS network node on which the JDBS application resides and from which the protocol starts. The other nodes shown in the figure are JDBS nodes located elsewhere in the JDBS network which, in this specific phase, can accept requests because they are visible in Peer Node 1's neighborhood. If no peer is visible in the neighborhood, then JDBS is constrained to work locally and distributed backup cannot be achieved at this stage.

2.5.2.2 Free space response phase

In the "Free space response phase", each peer returns a "yes" or "no" response to the asking peer. This phase is shown in the figure below:

JDBS - Free space response

This phase occurs only if phase 1, the "Free space request phase", has been performed. Leaving network problems aside, one of the following cases can occur in this phase:

No peers have been found for such backup storage.
Some peers have been found but the replication degree has not been satisfied.
A valid set of peers can accept the user's backup artifact.

In the first case there is a problem, because the backup cannot be stored online on any peer, only locally on the user's node (Peer Node 1 in the figure).

In the second case some peers are available for storage, but the number of peers found is not enough to satisfy the replication degree chosen by the user. The problem can be solved by deferring the attainment of the replication degree to a later stage; in the meantime the user can perform a less restrictive upload (in terms of replication degree) to the available peer machines.

The third case is the normal one, in which enough peer machines have been found to store the user's backup artifact and the replication degree chosen by the user has been reached.
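The three cases can be expressed as a small decision function over the number of peers that answered "yes" and the replication degree chosen by the user. The names `FreeSpaceResponses`, `Outcome` and `classify` are hypothetical and only serve the illustration.

```java
// Hypothetical helper, not part of JDBS: classifies the result of the free space response phase.
final class FreeSpaceResponses {
    // One value per case described in this section.
    enum Outcome { NO_PEERS, PARTIAL_REPLICATION, FULL_REPLICATION }

    private FreeSpaceResponses() {}

    // acceptingPeers: how many peers answered "yes";
    // replicationDegree: how many copies the user asked for.
    static Outcome classify(int acceptingPeers, int replicationDegree) {
        if (acceptingPeers <= 0) {
            return Outcome.NO_PEERS;            // only local storage is possible
        }
        if (acceptingPeers < replicationDegree) {
            return Outcome.PARTIAL_REPLICATION; // upload now, complete replication later
        }
        return Outcome.FULL_REPLICATION;        // normal case
    }
}
```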

2.5.2.3 Upload phase

This is the third and final phase, called the "upload phase", in which the asking peer, according to the replication degree chosen by the user, tries to upload the backup file to the target peers. The target peers should be chosen according to parameters such as connection bandwidth, ping times, etc. Note that the disk icon represents the user's backup plus a copy of the user's asymmetric public key (this is needed in the backup retrieval protocol to check the artifact's ownership), and that the various peers are machines of any type located elsewhere in the JDBS network. This phase is shown in the figure below:

JDBS - Upload phase

Once phase 2, the "Free space response phase", has been performed successfully, phase 3 becomes feasible. Some problems can occur during this phase, namely peer failure and loss of connectivity. In both cases an invalid state can be reached, because the distribution of the user's backup artifact or the replication degree can probably not be guaranteed. The first problem could be solved by searching for the peer again or by storing the backup locally, while the second, which concerns the replication degree, can be resolved at a later stage. If none of these problems occurs, the upload phase simply sends the backup file to the selected set of peer machines.
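One possible way to choose the target peers for this phase is to rank the peers that accepted the request and keep as many as the replication degree requires. The sketch below assumes a ranking by highest bandwidth first and lowest ping time as tie-breaker; this policy, the `Peer` fields and the name `PeerSelector` are assumptions, since the requirements only list bandwidth and ping times as example parameters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of JDBS: picks upload targets among accepting peers.
final class PeerSelector {
    // Minimal description of a peer as seen by the asking node (assumed fields).
    static final class Peer {
        final String name;
        final long bandwidthKbps;
        final long pingMillis;

        Peer(String name, long bandwidthKbps, long pingMillis) {
            this.name = name;
            this.bandwidthKbps = bandwidthKbps;
            this.pingMillis = pingMillis;
        }
    }

    private PeerSelector() {}

    // Returns at most replicationDegree peers, best candidates first.
    static List<Peer> chooseTargets(List<Peer> accepting, int replicationDegree) {
        List<Peer> ranked = new ArrayList<>(accepting);
        Comparator<Peer> byBandwidthDesc =
                Comparator.comparingLong((Peer p) -> p.bandwidthKbps).reversed();
        ranked.sort(byBandwidthDesc.thenComparingLong(p -> p.pingMillis));
        int count = Math.max(0, Math.min(replicationDegree, ranked.size()));
        return new ArrayList<>(ranked.subList(0, count));
    }
}
```

If fewer peers than the replication degree are returned, the caller is in the partial replication situation and can complete the replication at a later stage.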

2.5.3 Backup retrieval protocol

The following subparagraphs and figures show a network context with some discovered peer machines during an online backup retrieval process. This process is made up of 5 phases:

The lookup request phase.
The lookup response phase.
The ownership check request phase.
The ownership check response phase.
The download phase.

A detailed description of each phase is given in the paragraphs below.

2.5.3.1 Lookup request phase

This is the first phase of the retrieval protocol, in which the user, after providing a search key for the backup of interest and his RSA public key, asks every peer in the neighborhood whether it has a valid entry for the backup artifact being searched for.

JDBS - Lookup request phase

This phase is the entry point of the "Backup retrieval protocol": it is the stage in which a user starts searching for what is presumably his backup artifact. How a search string for that backup can be built has been described previously in the section named "JDBS Offline context". Unfortunately, the major problems can occur in this phase: if no peer is found in the neighborhood, the user cannot retrieve his backup, and this is not a completely solvable problem (should the user use his local backup copy?), because the only option is to search again for the peers that contain the backup artifact. In any case, once this phase has been performed the retrieval protocol can proceed.

2.5.3.2 Lookup response phase

This is the second phase of the retrieval protocol. In this phase the user's peer waits for the other peers' responses; those responses indicate the peers on which the user's backup artifact has been located and from which it can be restored (downloaded).

JDBS - Lookup response phase

As described in the previous phase of the protocol, if no answer is received from the JDBS network or from the neighborhood, the protocol fails and the only remaining option is to use the local backup copy (if it exists).

2.5.3.3 Ownership check request phase

This phase of the protocol is the one that should guarantee security and prevent abuse of the peers' connection bandwidth. Once the previous phase has been completed successfully and the user has provided JDBS with his asymmetric private key (note that this is a prerequisite of the protocol), JDBS sends a packet of data encrypted with the private key to the chosen peers. Each peer can then check the identity of the user by checking the encrypted data against the asymmetric public key associated with the backup artifact the user wants to retrieve. This phase of the protocol is shown in the figure below:

JDBS - Ownership check request

On each peer machine (Peer Node 2 and Peer Node 4 in the figure), the encrypted data sent by the user is checked against the public key associated with the backup, so that the authenticity of the user is verified and the integrity of the backup file is guaranteed as well. Leaving aside the network problems that may arise, no other error can occur in this phase.
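The check described above can be sketched with the standard java.security API. This is only an illustration under stated assumptions, not the JDBS implementation: "encrypting a packet with the private key" is expressed here as a digital signature (SHA256withRSA), and the random challenge issued by the peer is an assumption added so that an old proof cannot simply be replayed. The class and method names (OwnershipCheck, newChallenge, prove, verify, newKeyPair) are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;

/** Hypothetical helper; the names are illustrative and not part of JDBS. */
final class OwnershipCheck {

    /** Peer side: create a fresh random challenge (assumption: not in the original protocol). */
    static byte[] newChallenge() {
        byte[] nonce = new byte[32];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    /** User side: sign the challenge with the private key of the backup's owner. */
    static byte[] prove(PrivateKey ownerKey, byte[] challenge) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(ownerKey);
        signer.update(challenge);
        return signer.sign();
    }

    /** Peer side: check the proof against the public key stored with the artifact. */
    static boolean verify(PublicKey storedKey, byte[] challenge, byte[] proof)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(storedKey);
        verifier.update(challenge);
        return verifier.verify(proof);
    }

    /** Convenience for experiments: generate a 2048-bit RSA key pair. */
    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }
}
```

In this sketch a peer would answer "yes" only when verify returns true for the public key it stored together with the artifact; a proof produced with any other private key makes verify return false.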

2.5.3.4 Ownership check response phase

The main role of this phase is to send a "yes" or "no" response to the peer node that started the protocol (Peer Node 1 in the figure below). If the user has provided a correct encrypted data packet and the receiving node has successfully checked it using the user's public key associated with the artifact, the phase can be considered successfully completed and the protocol can proceed to the next phase; otherwise some problems can occur. A figure of this context is placed below:

JDBS - Ownership check response

The possible configurations or states that cause inconsistency in this phase are:

The provided encrypted data packet doesn't match the user's public key associated with the backup.
Some peers return a "no" response and others return a "yes" one.

In the first case, it is possible that the user is not the real owner of the backup, and so he will not be allowed to download it; alternatively, the backup's integrity may not have been preserved, and unfortunately there is no way to recover that backup (in the case of file corruption the user should probably be given the opportunity to delete the inconsistent backup, or simply wait for the end of the backup's life cycle, which makes the backup obsolete).

In the second case, the backups on the peers that responded "no" are probably corrupted. The user can download his backup from the remaining available nodes.
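The two inconsistent states can be told apart by a small piece of logic on the requesting peer. The following is a minimal sketch, assuming the responses have already been collected into a map from peer name to a boolean answer; the class OwnershipResponses, its Outcome values and its method names are hypothetical and introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that interprets the "yes"/"no" answers of the contacted peers. */
final class OwnershipResponses {

    enum Outcome { ALL_YES, ALL_NO, MIXED, NO_RESPONSE }

    /** Classifies the collected answers (true = "yes", false = "no"). */
    static Outcome classify(Map<String, Boolean> responses) {
        if (responses.isEmpty()) {
            return Outcome.NO_RESPONSE;
        }
        boolean anyYes = responses.containsValue(Boolean.TRUE);
        boolean anyNo = responses.containsValue(Boolean.FALSE);
        if (anyYes && anyNo) {
            return Outcome.MIXED;   // some copies are probably corrupted
        }
        return anyYes ? Outcome.ALL_YES : Outcome.ALL_NO;
    }

    /** Peers that answered "yes" and can therefore be used in the download phase. */
    static List<String> downloadablePeers(Map<String, Boolean> responses) {
        List<String> peers = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : responses.entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                peers.add(entry.getKey());
            }
        }
        return peers;
    }
}
```

ALL_NO corresponds to the first inconsistent state (wrong owner or corrupted backup everywhere), while MIXED corresponds to the second one, in which the download can continue from the peers returned by downloadablePeers.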

2.5.3.5 Download phase

Once the previous phases have been completed successfully, the download from a peer selected by the user can be performed and the backup can be saved in the user's local repository, as shown in the following figure.

JDBS - Download phase

The only problems that can occur in this phase are typical, well-known network problems.

2.6 Use cases and scenarios

This section presents a set of use cases, accompanied by an exhaustive set of application scenarios that help characterize the JDBS functionalities and the system's functional requirements. The user is the principal human actor of JDBS, while JDBS itself plays a fundamental role as system actor.

2.6.1 Use Case - Backup artifact creation

The goal of the use case presented in the following figure is to show how a JDBS user can create a new backup artifact by providing JDBS with a list of files to back up (the backup's data), a symmetric password (used to encrypt the backup's data) and an asymmetric private key (used to sign the backup artifact).

JDBS - Backup artifact creation

The following paragraphs present the complete set of scenarios for this use case.

2.6.1.1 File selection

Id	0001
Name	File selection
Description	This use case represents the User's selection of the files that must be part of the final backup artifact.
Actors	User
Preconditions	Both the User and JDBS must have access to the file system.
Principal Flow	Once the User has chosen an action like "New Backup Creation", JDBS asks him to choose one or more files that will be part of the new backup. The User must select, through a view of the file system, the set of files he wants to back up.
Alternative Flows	Not present.
Post Conditions	JDBS has a valid set of selected files to work on during the backup creation.

2.6.1.2 Choose a symmetric password

Id	0002
Name	Choose a symmetric password
Description	After a valid set of files to back up has been chosen, JDBS asks the User to provide a non-empty password in order to encrypt those files, creating an encrypted file for each one.
Actors	User
Preconditions	The User has chosen a set of files that must be part of the backup.
Principal Flow	The User is asked to enter a non-empty password; once it has been provided, JDBS starts to encrypt each file.
Alternative Flows	The User has already provided JDBS with a password to encrypt backup artifacts and does not want to change it. In this case JDBS skips this step.
Post Conditions	JDBS has a password to encrypt the backup's data.

2.6.1.3 Choose a private key

Id	0003
Name	Choose a private key
Description	Once a compressed backup artifact has been created by JDBS, the backup must be signed with an asymmetric cryptographic signing approach. To do so, the User must provide JDBS with his own private key, either choosing it from his key ring or creating a new key pair and taking the private key from it.
Actors	User
Preconditions	The User has chosen a set of files that must be part of the backup.
Principal Flow	The User is asked to provide a private key, either choosing it from his key ring or creating a new key pair; once it has been provided, JDBS can use it to sign the backup artifact.
Alternative Flows	Not present.
Post Conditions	JDBS has a private key with which to sign the backup artifact.

2.6.1.4 File encryption

Id	0004
Name	File encryption
Description	Once the User has chosen a valid set of files to back up, JDBS encrypts each of those files with a symmetric cryptographic approach, using the password previously provided by the User.
Actors	JDBS
Preconditions	The User has chosen a set of files that must be part of the backup. The User has provided JDBS with a valid password.
Principal Flow	JDBS encrypts each file with a symmetric approach, using the password provided by the User.
Alternative Flows	Not present.
Post Conditions	JDBS has created a valid encrypted version of the files to back up and can proceed to the compression phase.
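
The scenario above can be illustrated with the standard javax.crypto API. This is a minimal sketch under assumptions that the scenario does not state: the key is derived from the password with PBKDF2 (100,000 iterations, 128-bit key) and the data is encrypted with AES in GCM mode; the output layout (salt, then IV, then ciphertext) and the names PasswordCipher, encrypt and decrypt are hypothetical choices made for this example.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical password-based file encryption; not the JDBS implementation. */
final class PasswordCipher {
    private static final int SALT_LEN = 16;   // bytes
    private static final int IV_LEN = 12;     // bytes, usual size for GCM
    private static final int TAG_BITS = 128;  // GCM authentication tag

    /** Derives an AES key from the password (PBKDF2 is an assumption of this sketch). */
    private static SecretKeySpec deriveKey(char[] password, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 100_000, 128);
        byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(key, "AES");
    }

    /** Returns salt, IV and ciphertext concatenated in that order. */
    static byte[] encrypt(char[] password, byte[] plain) throws GeneralSecurityException {
        if (password.length == 0) {
            throw new IllegalArgumentException("password must not be empty");
        }
        SecureRandom random = new SecureRandom();
        byte[] salt = new byte[SALT_LEN];
        byte[] iv = new byte[IV_LEN];
        random.nextBytes(salt);
        random.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt),
                new GCMParameterSpec(TAG_BITS, iv));
        byte[] body = cipher.doFinal(plain);
        byte[] out = new byte[SALT_LEN + IV_LEN + body.length];
        System.arraycopy(salt, 0, out, 0, SALT_LEN);
        System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
        System.arraycopy(body, 0, out, SALT_LEN + IV_LEN, body.length);
        return out;
    }

    /** Reverses encrypt; throws if the password is wrong or the data was altered. */
    static byte[] decrypt(char[] password, byte[] data) throws GeneralSecurityException {
        byte[] salt = Arrays.copyOfRange(data, 0, SALT_LEN);
        byte[] iv = Arrays.copyOfRange(data, SALT_LEN, SALT_LEN + IV_LEN);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt),
                new GCMParameterSpec(TAG_BITS, iv));
        return cipher.doFinal(data, SALT_LEN + IV_LEN, data.length - SALT_LEN - IV_LEN);
    }
}
```

The empty-password check mirrors the "non-empty password" requirement of scenario 0002. An authenticated mode such as GCM was picked here so that a wrong password is detected at decryption time instead of silently producing garbage.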

2.6.1.5 File compression

Id	0005
Name	File compression
Description	Once JDBS has created a valid set of encrypted files, it can compress those files in order to create an unsigned backup artifact.
Actors	JDBS
Preconditions	JDBS has created a valid set of encrypted files.
Principal Flow	JDBS compresses the files into a new, single-file backup artifact.
Alternative Flows	Not present.
Post Conditions	JDBS has created a compressed file that represents the backup artifact.
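
As an illustration of this step, the sketch below bundles several encrypted files into one archive with java.util.zip. The ZIP format, the in-memory representation (a map from file name to content) and the names ArtifactPacker, pack and unpack are assumptions of this example; the scenario only requires a single compressed file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical packer that turns a set of files into a single archive. */
final class ArtifactPacker {

    /** Writes every file as one ZIP entry and returns the archive bytes. */
    static byte[] pack(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue());
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    /** Reads the archive back into a name-to-content map, preserving entry order. */
    static Map<String, byte[]> unpack(byte[] archive) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    content.write(chunk, 0, read);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        }
        return files;
    }
}
```

Since well-encrypted data is close to incompressible, in the flow described here (encryption before compression) the archive mainly serves to bundle the files into one artifact.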

2.6.1.6 GUID assignment

Id	0006
Name	GUID assignment
Description	Once JDBS has created a single, compressed backup file, it must assign a globally unique identifier to that backup.
Actors	JDBS
Preconditions	JDBS has created a valid single and compressed backup artifact.
Principal Flow	JDBS assigns a globally unique name to the previously created backup artifact.
Alternative Flows	Not present.
Post Conditions	JDBS has correctly assigned a globally unique identifier to the backup artifact and can proceed to the asymmetric signing phase of the backup.
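
One simple way to obtain such an identifier on the Java platform is java.util.UUID. The sketch below is an assumption, not the generator specified for JDBS: it uses a random (version 4) UUID, and both the ".jdbs" file extension and the name ArtifactNames are hypothetical.

```java
import java.util.UUID;

/** Hypothetical naming helper based on random UUIDs. */
final class ArtifactNames {

    /** Returns a new artifact file name such as "<uuid>.jdbs" (extension is illustrative). */
    static String newArtifactName() {
        return UUID.randomUUID().toString() + ".jdbs";
    }
}
```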

2.6.1.7 Sign the backup artifact

Id	0007
Name	Sign the backup artifact
Description	Once JDBS has created a single, compressed backup file with a globally unique identifier, it must sign the backup with an asymmetric cryptographic approach in order to create the final backup artifact.
Actors	JDBS
Preconditions	JDBS has assigned to the backup artifact a global unique identifier.
Principal Flow	JDBS signs the backup artifact with the private key previously provided by the User, in order to guarantee data ownership and other security properties.
Alternative Flows	Not present.
Post Conditions	JDBS has correctly signed the backup artifact and can proceed to its storage in the local repository.

2.6.1.8 Local storage

Id	0008
Name	Local storage
Description	Once JDBS has created a signed backup artifact it must save such backup in the User's local repository.
Actors	JDBS
Preconditions	JDBS has created a signed backup artifact. A local repository has been defined by the User.
Principal Flow	JDBS moves the signed backup from the working directory to the User's local repository.
Alternative Flows	Not present.
Post Conditions	JDBS has correctly stored the backup artifact in the local repository and can proceed, if the User so wishes, to the distributed backup storage.

2.6.2 Use Case - Distributed backup storage

The role of this use case is to show how a backup previously stored in the user's local repository can be uploaded to other peers, reaching the goal of distributed storage. This use case, as described in the scenarios that follow, is accessible only after a valid backup creation, which is described in the corresponding use case.

JDBS - Distributed backup storage

The following paragraphs present the complete set of scenarios for this use case.

2.6.2.1 Backup selection

Id	0009
Name	Backup selection
Description	Once a signed backup artifact is available in the user's local repository, the user can try to distribute that backup on the JDBS network. To do so, the user must select from his local repository the backup he wants to distribute.
Actors	User
Preconditions	A signed backup artifact is available in the local repository. JDBS is working online or an Internet connection is available. Both the User and JDBS have access to the file system.
Principal Flow	Once the user has chosen an action like "New backup distribution", he has to select the particular backup he wants to export to other peers.
Alternative Flows	Not present.
Post Conditions	The user has provided JDBS, through a view of the file system, with a valid backup artifact.

2.6.2.2 Life cycle and replication degree definition

Id	0010
Name	Life cycle and replication degree definition
Description	Once a backup artifact has been chosen, the user must define a life cycle and a replication degree for it. The life cycle defines how long the artifact lives on the network, whereas the replication degree defines the number of nodes to which the artifact must be uploaded.
Actors	User
Preconditions	A signed backup artifact has been selected from the user's local repository.
Principal Flow	When asked by JDBS, the user must provide a life cycle in terms of hours, days or years. Likewise, when asked by JDBS, the user must provide an integer value that represents the replication degree of the artifact he wants to distribute.
Alternative Flows	Not present.
Post Conditions	The backup has a valid life cycle and a valid replication degree associated with it.

2.6.2.3 Neighbourhood search

Id	0011
Name	Neighbourhood search
Description	Once a backup artifact has been chosen and a valid life cycle and a valid replication degree have been associated with it, the user must start a neighbourhood search to look for JDBS peers. From the retrieved peer list the user will be able to select, according to the defined replication degree, a set of peers on which to distribute the artifact.
Actors	User, JDBS
Preconditions
Principal Flow	The user performs an action like "Search the Neighbourhood"; once JDBS has finished the search, a list of peers on which the artifact can be distributed is available. The peer search also takes the size of the artifact into account.
Alternative Flows	Not present.
Post Conditions	A fresh list of available peers has been retrieved from JDBS.

2.6.2.4 Peers selection

Id	0012
Name	Peers selection
Description	Once a list of peers has been retrieved, the user can select the peers to which he wants to distribute his artifact.
Actors	User
Preconditions	A fresh list of peers is available to JDBS.
Principal Flow	The user selects the peers, taking care to choose those that are most suitable in terms of parameters such as minimum ping, nearest location, maximum available space, etc.
Alternative Flows	If the number of selected peers does not reach the artifact's replication degree, JDBS schedules a post-upload procedure that waits in the background for suitable peers to upload the artifact to. This is a sort of automatic upload scheduling, and it is computed and stored locally.
Post Conditions	A list of target peers has been chosen by the user and provided to JDBS or a background schedule has been created.
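
The selection criteria of this scenario can be sketched as a simple ranking. The example below is only an illustration: in the scenario the choice is made by the user, while here a hypothetical helper proposes candidates. It assumes that each peer is described by a name, a ping time and its free space, keeps only peers that can hold the artifact, and orders them by ping first and by free space second; the names PeerRanking, Peer, select and missingReplicas are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical ranking of candidate peers; not part of JDBS. */
final class PeerRanking {

    /** Minimal peer description assumed by this sketch. */
    static final class Peer {
        final String name;
        final long pingMillis;
        final long freeBytes;

        Peer(String name, long pingMillis, long freeBytes) {
            this.name = name;
            this.pingMillis = pingMillis;
            this.freeBytes = freeBytes;
        }
    }

    /** Picks up to replicationDegree peers: lowest ping first, then largest free space. */
    static List<Peer> select(List<Peer> candidates, long artifactBytes, int replicationDegree) {
        List<Peer> suitable = new ArrayList<>();
        for (Peer peer : candidates) {
            if (peer.freeBytes >= artifactBytes) {
                suitable.add(peer);
            }
        }
        suitable.sort(Comparator.comparingLong((Peer p) -> p.pingMillis)
                .thenComparing(Comparator.comparingLong((Peer p) -> p.freeBytes).reversed()));
        int count = Math.min(replicationDegree, suitable.size());
        return new ArrayList<>(suitable.subList(0, count));
    }

    /** Replicas still to be placed by the background schedule of the alternative flow. */
    static int missingReplicas(List<Peer> selected, int replicationDegree) {
        return Math.max(0, replicationDegree - selected.size());
    }
}
```

A non-zero result of missingReplicas corresponds to the alternative flow, in which JDBS schedules the remaining uploads in the background.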

2.6.2.5 Backup upload

Id	0013
Name	Backup upload
Description	Once a list of peers to which the artifact can be uploaded has been chosen, the user must perform an action like "Start distribution" in order to upload his artifact to the chosen peer nodes.
Actors	User
Preconditions	A list of target peers has been provided to JDBS through the user's peer selection, or a background upload schedule has been created by JDBS.
Principal Flow	Once the user has started the upload process, he will see some uploads start in the background, or will in any case be notified by JDBS about the actions it is performing.
Alternative Flows	Not present.
Post Conditions	The backup artifact has been uploaded to the target peers in accordance with the chosen replication degree.

2.6.3 Use Case - Distributed backup retrieval

This use case shows how a user can retrieve a copy of a backup from the JDBS network. As previously described in the paragraph "Backup retrieval protocol", some preconditions must be satisfied before a backup download can be started; the main prerequisite is ownership of the backup.

JDBS - Distributed backup retrieval

The following paragraphs present the complete set of scenarios for this use case.

2.6.3.1 Search key definition

Id	0014
Name	Search key definition
Description	In order to find a distributed backup stored in JDBS, the user must provide a search key with which the backup will be searched.
Actors	User
Preconditions	Not present.
Principal Flow	Once the user has performed an action like "Retrieve a backup", he must provide JDBS with a search key with which the backup can be found in the network of currently available peers.
Alternative Flows	Not present.
Post Conditions	JDBS has a search key and can proceed to search for available network backup artifacts.

2.6.3.2 Neighborhood search

Id	0015
Name	Neighborhood search
Description	Once the user has provided a key to search for an online artifact and performed an action like "Search", JDBS starts searching the neighbourhood for existing copies of the searched backup.
Actors	User, JDBS
Preconditions	The user has provided a search key for the backup he wants to retrieve.
Principal Flow	JDBS searches the neighbourhood, looking for a backup that matches the provided search key.
Alternative Flows	Not present.
Post Conditions	JDBS has found a set of peers with a local copy of the searched backup.

2.6.3.3 Provide a private key

Id	0016
Name	Provide a private key
Description	Once JDBS has found a valid set of peers on which the backup is stored, it asks the user to provide the private key with which the backup was signed during its creation, in order to perform an ownership check.
Actors	User, JDBS
Preconditions	Not present.
Principal Flow	JDBS asks the user to provide his private key in order to perform the actions described in the "Backup retrieval protocol" section.
Alternative Flows	The user has provided JDBS with a default private key; in this case JDBS will try to use that.
Post Conditions	Once JDBS has checked the ownership successfully, it can proceed to the download phase; otherwise it warns the user about what has happened.

2.6.3.4 Backup download

Id	0017
Name	Backup download
Description	Once JDBS has found a valid set of peers from which to download the user's artifact and the ownership check has been performed successfully, the backup download can be started. The backup will be placed in the user's local repository.
Actors	User, JDBS
Preconditions	The user has performed an action like "Start download". JDBS has successfully checked the artifact ownership.
Principal Flow	JDBS will start downloading the artifact.
Alternative Flows	Not present.
Post Conditions	JDBS has successfully downloaded the artifact placing it in the user's local repository.

3.0 Analysis

In this section we discuss the problems found during the requirements analysis, try to find a valid pattern to approach them, and finish with a logic architecture proposal for the system. The logic architecture specifies a set of software layers, together with the base entities and sub-components that each layer contains and that play a fundamental role in the system.

3.1 Problem Analysis

From the requirements analysis we can state that JDBS is a feasible system, because it is similar to most of the p2p applications available nowadays. Because JDBS can be used either as a stand-alone application or as a subsystem of InternetCafe, and because InternetCafe provides a graphical user interface, JDBS must provide a GUI too. Accordingly, this GUI has to provide a way to easily manage the business logic placed underneath it. The business logic provides a way to manage the domain entities, such as users, backups and the network, offering and making easily accessible all the functions and system behaviors highlighted in the requirements analysis. A possible layered view of JDBS is shown in the figure below:

JDBS - Two layers view

The role of the Graphical User Interface Layer is to present a compact view of the system to the application's user, whereas the Business Logic Layer represents the system's core and offers all of the system's functionalities. The arrow placed between the two layers indicates that the layers interact: the GUI has to be updated when the business logic changes, and the Business Logic must receive from the GUI the parameters and runtime configurations provided by the user.

3.2 Structural Analysis

Given that JDBS must have a GUI that plays the architectural role of presentation layer, further layers can be identified by decoupling the previously mentioned Business Logic Layer. The business layer can be divided into the following layers:

Static layer whose role will be to provide access to entities with static and immutable behaviors or characteristics (user, backup and static components or utilities).
Execution (or Dynamic) layer that will represent the system execution offering a runtime support to the application interactions.
Network layer whose role will be the management and the control of the network behaviors (JDBS network vision, peer node management, connections).

A figure that represents the structural view of JDBS is placed below; note that the layer interactions are not specified here and will be explained extensively in another section:

JDBS - Layered view

The following paragraphs explain the role that each layer plays in JDBS.

3.2.1 GUI Layer

As previously mentioned, the GUI layer represents the application front-end. Through the entities made available by the platform (Java), such as Frames, Panels, Buttons, etc., it presents the user with a view of JDBS and makes accessible the functionalities provided by the layers placed below it in the layer stack. Some of the system views and displayable functionalities are listed below:

Network view
- Neighborhood search facilities.
- Download and upload facilities.
- Peers and relative descriptions.
Local repository view
- Locally stored backups management.
Backup view
- Creation, deletion and management.
Configuration view
- Definition of symmetric and asymmetric keys.
- Local repository configurations (free space, local paths, etc...).
- User data definition.

The given description of the system's views is not exhaustive and doesn't cover many aspects of the layer, but it serves to identify the main GUI sub-components.

3.2.2 Static Layer

The static layer provides JDBS with those entities and sub-components whose characteristics and behaviors are static or immutable. The following sub-paragraphs contain diagrams that show the domain entities and components.

3.2.2.1 Core entities

According to the requirements analysis we can identify some entities, such as User, Backup, Repository, Symmetric and Asymmetric Keys, Key Ring, etc. A UML diagram that shows the relations between these entities is placed below:

JDBS - Core entities

Note that the entities that take part in the network architecture will be described in the project paragraphs, because they depend on an external component (JXTA).

3.2.2.2 Core components

This paragraph presents some static components or utilities; for each one, a diagram or a set of interfaces that it must implement is given:

Symmetric cipher
Asymmetric cipher
Compression
Global Unique Identifier Generator

These components take part in the JDBS architecture by offering the core functionalities; the specification of network behaviors and functionalities will be given in the project paragraphs.

3.2.3 Execution Layer

Because JDBS has to support backup downloads and uploads as well as other functionalities (such as compression or cryptographic facilities), one problem is to find a valid solution in the business layer to support this requirement; we know that each of these actions involves intensive resource usage (just think of downloads, uploads or data compression). To prevent the application from blocking because everything runs in a single execution entity, JDBS must provide at this layer a component that supports concurrent execution. The main role of this execution component is to offer runtime support in the form of synchronous or asynchronous execution units, taking care to make each of those units replicable. Another feature that must be implemented is execution tracking, in order to support undo, redo and stop operations on the various execution units.

This layer will be described in detail in the project step; an interface that represents the runtime executor is placed here:

JDBS - Executor interface
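
To make the idea of synchronous and asynchronous execution units with tracking more concrete, here is a minimal sketch built on java.util.concurrent. It is not the interface shown in the figure: the class RuntimeExecutor, its method names, the string unit identifiers and the pool size of four threads are all assumptions of this example, and only the stop operation is covered (undo and redo would need unit-specific logic).

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical runtime executor that tracks its execution units by id. */
final class RuntimeExecutor implements AutoCloseable {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Future<?>> tracked = new ConcurrentHashMap<>();

    /** Asynchronous execution: returns at once, the unit runs on a worker thread. */
    <T> Future<T> submit(String unitId, Callable<T> unit) {
        Future<T> future = pool.submit(unit);
        tracked.put(unitId, future);
        return future;
    }

    /** Synchronous execution: blocks the caller until the unit has finished. */
    <T> T run(String unitId, Callable<T> unit) throws Exception {
        try {
            return submit(unitId, unit).get();
        } finally {
            tracked.remove(unitId);
        }
    }

    /** Stop operation: cancels a tracked unit, interrupting it if it is running. */
    boolean stop(String unitId) {
        Future<?> future = tracked.remove(unitId);
        return future != null && future.cancel(true);
    }

    /** Shuts the worker threads down. */
    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```

With such a component the GUI thread could call submit for long uploads or compressions and remain responsive, while stop gives the user a way to abort a unit by its identifier.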

3.2.4 Network Layer

As described when we talked about the network protocols, it is possible to use a third-party middleware or software layer to avoid re-engineering a peer-to-peer base layer. At the network layer JDBS must support and solve some problems typical of this context, such as peer discovery, point-to-point communication and data exchange between network nodes. Supporting and implementing strategies at the application level, rather than base functional aspects, will lead to a smarter application deployment and to a reduction of development time. When we design the JDBS network we will probably adopt JXTA as the starting base to implement the JDBS network vision; in fact, it supports group management (creation, join and leave), network discovery, node description and many other ready-made p2p network functionalities. A conceptual diagram of the JXTA vision is placed below:

JDBS - JXTA

3.3 Behavioral Analysis

Rather than other functional behaviors, what must be described here is the support that JDBS gives the user during download and upload actions. As mentioned in the requirements analysis (scenario 0012), the system must integrate a component that manages the scheduling of network operations, queuing them and executing them in the background. The impact of such a sub-component must be considered at this point, because it introduces a form of automation that is a behavior of the system. The network scheduler (as it can be named) must explore the network at a fixed interval and must perform the planned downloads or uploads of backup artifacts according to the user's requests and configurations. To do so, an automatic backup search engine must be provided, and it must be coordinated with the scheduler in order to reach the scheduled goals.
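
A minimal sketch of such a scheduler, assuming the standard ScheduledExecutorService, is given below. Everything specific in it is hypothetical: the class NetworkScheduler, the representation of a planned transfer as an artifact identifier, and the tryTransfer callback, which stands in for the search engine plus the actual upload or download and returns true when the transfer succeeded.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical background scheduler for planned network transfers. */
final class NetworkScheduler implements AutoCloseable {
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private final Predicate<String> tryTransfer;

    NetworkScheduler(Predicate<String> tryTransfer) {
        this.tryTransfer = tryTransfer;
    }

    /** Plans a transfer for the given artifact identifier. */
    void plan(String artifactId) {
        pending.add(artifactId);
    }

    /** One exploration round: try each pending transfer once, keep those that fail. */
    void runOnce() {
        int rounds = pending.size();
        for (int i = 0; i < rounds; i++) {
            String artifactId = pending.poll();
            if (artifactId == null) {
                break;
            }
            if (!tryTransfer.test(artifactId)) {
                pending.add(artifactId);   // retry in the next round
            }
        }
    }

    /** Repeats runOnce at a fixed rate; failures are swallowed so the schedule survives. */
    void start(long period, TimeUnit unit) {
        timer.scheduleAtFixedRate(() -> {
            try {
                runOnce();
            } catch (RuntimeException e) {
                // an uncaught exception would cancel all further executions
            }
        }, 0, period, unit);
    }

    int pendingCount() {
        return pending.size();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

A call such as start(10, TimeUnit.MINUTES) would then repeat the exploration every ten minutes until close is invoked; the interval is, again, only an example value.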

3.4 Interaction Analysis

This section explains the user's interactions and the system's internal interactions between software layers. First of all, by interaction we mean a way to provide an input and wait for an output; such an output can also be hidden by the system. As described in the use cases presented in the requirements analysis, the user interacts with JDBS through the UI. The UI provides the facilities described in the previous paragraphs and should be managed according to a sort of MVC (Model View Controller) pattern. The classical MVC schema is presented below:

JDBS - Model View Controller

The presented diagram means that the User interacts with the View, which takes data from the Model and is updated by the Controller. The Controller itself receives requests from the View through the inputs provided by the user and provides updates to the Model. Accordingly, we must define which layers match this abstract representation. It is clear that the View corresponds to the GUI layer, whereas the Controller is the Execution layer and the Model corresponds to both the static and network layers. To provide this kind of interaction, a data format or interchange format with which the various layers communicate must be designed; this aspect will be part of the project and will be extensively described there.

3.5 Risks Analysis

4.0 Logic Architecture

The logic system architecture refers to the view of packages, roles and entities found during the analysis process. The following paragraphs contain figures and conceptual diagrams that represent the components that will be cited or used more and more in the project phase. It must be clear that the following diagrams are only conceptual views of the system; through those views the software analyst wants to highlight some properties and express a possible architecture from which to start the project phases. As is well known, there are many software architectures, but choosing one architecture rather than another can sometimes lead to a better final solution.

Here is placed a diagram that represents the abstract architectural package view for JDBS:

JDBS - Abstract packaged view

Note the semantics of the colors associated with each package; it will be used from now on whenever we refer to domain elements. The dependency relations presented in that diagram are "usage" relations between packages.

4.1 Detailed architectural view

An abstract but richer architectural view of JDBS is given in the accompanying figure; in the following paragraphs a more detailed representation of each software component will be defined through a progressive and incremental specification of functionalities and identified roles:
