Background and Project
Requirements
Note that the programming assignments are built around the
BestPeer P2P project, which is in the re-implementation phase. The codes that you develop must therefore work
with the existing system. You can work
in a group of 2-3 persons, and projects will be allocated on the first come
first serve basis. If you have any
questions or problems regarding the implementation, you are welcome to consult Dr Xu Linhao (
If you program the project with JAVA, please pay attention to the followings:
1. All rules mentioned in [1] and [2] must be followed strictly, and all algorithms must be described in the report and comments.
2. All methods/variables must be commented.
3. If you need help on how to write codes properly, please unzip src.zip file in the Java SDK directory and read the source codes.
4. Please use Java SDK 1.5 to develop the project.
If you program the project with C/C++, please pay attention to the followings:
1.
All rules (naming convention and coding style) mentioned
in [4] must be followed strictly, and all algorithms must be described in the
report and comments.
2.
All methods/variables must be commented.
3.
It is recommended to use the STL (Standard
Template Library).
Report that explains the design of the algorithms, and flow of the program must be submitted together with the program. The assignment is due by 15th November 2006. Total mark: 40%. Marks will be given for good program structure, efficient codes, robustness of the program, innovation and documentation/report.
Programming
Assignments
1.
Name |
Instant Message Module |
Goal |
Providing the facility between remote users (i.e., client peers) to exchange messages and upload/download files |
Precondition |
Relatively
independent of the current version of the Best-Peer project; the basic socket
communication framework in the current version can be used but should take
place “java.net.Socket” with “java.net.DatagramSocket” |
Requirements |
(1) The
network bandwidth at each peer side should be used efficiently and the
implementation should be lightweight;
(2) Interface
must be well defined with detailed comments; (3) Separating
GUI from underlying operation, i.e., using MCV model. |
Specification |
(1) User & Group Management: register
a user with his email account; remove a user from system; create a user
group; add a user to a group; remove a user from a user group; (2) Message Communication: Send a message
to a user; receive messages from other users (use UDP connection here); store
all received messages of a remote user into a file locally and if the file
size exceeds a predefined-threshold, create a new one for recording new
messages (this can be implemented by “java.util.logging”); (3) Data Transfer: allow users to upload/download
files to/from remote users (use TCP connection here); the received file is
stored in the predetermined directory; allow users to specify what kind of
file types he wants to accept, and filter all insecure file types during file
transferring (4) Bootstrap: A bootstrap server should
be built first to maintain all online peers (this can use existing code in
the Best-Peer project), but different USER_ACCOUNT relation table should be
maintained by a database (e.g., MySQL) for evaluating the validness of any
user; (5) Online Status Monitor: the bootstrap
server needs to monitor the online status of each peer periodically, and
perform the corresponding operations: peer
joining – when a peer joins the network, the bootstrap server needs to
update all online peers, which have the friend relationship with it, for this
event and then all those online peers will update the online status of the
peer. At the same time, the bootstrap server needs also tell the peer which
friend peers are online; peer leaving
– when a peer leaves the network, the bootstrap server should notify all
friend peers to update the online status of the peer; peer failure – if the bootstrap server sends PING request to a
peer several times but without any response from it, the bootstrap server
will regard the peer failure and then notify all friend peers about this
event; bootstrap server failure –
when the bootstrap server is down, it will notify all peers to depart from
the system; or if any peer sends PING request to the bootstrap server several times but without
response, it will regards the bootstrap server failure and then leave the
network by sending LEAVE request to all online friend peers; |
Package Name |
sg.edu.nus.im |
Memo |
Google Talk and MSN are two good running examples |
2.
Name |
Network Simulation Module |
Goal |
Providing the facility to create the new instances of bootstrap server, super peer and client peer, and allow them to enter and leave the system with some well-known probability distribution |
Precondition |
Relatively
independent of the current version of the Best-Peer project; before
programming, should understand the basic interface of how to new client peer,
super peer, bootstrap server and the message type defined in class
“sg.edu.nus.protocol.MsgType”; |
Requirements |
(1) The
implementation must use event-driven
model and should be lightweight; (2) Interface
must be well defined with detailed comments; |
Specification |
(1) Defining events: use Factory pattern
(mentioned in [4]) to create network events for the simulator; the
pre-defined network event (needs to be implemented) includes events for creating the instances of different
peers, events for peers joining, leaving, events for sharing and un-sharing
local files, events for creating keywords queries; (2) Event generator: generate network
events with different distribution, such as Gaussian distribution,
exponential distribution, Poisson distribution, and put all events into a Priority Queue (study how to use this data structure
to simulate events); (3) Simulator: get network events from the
Priority Queue and execute the corresponding operation by invoking the
current APIs (may be not suitable for invoking but can do a simple simulation
if necessary); “java.lang.reflect” should be used for generating new instance
of peers; (4) Visualization: provide a GUI panel to
display the dynamic behavior of the peer instances; (5) Logging System status: for each step, all
information (must include error and exception messages) must be recorded into
a log file (using “java.util.logging”; |
Package Name |
sg.edu.nus.simulate |
3.
Name |
Online Peer Monitor Module |
Goal |
Providing the facility monitoring the behavior of all peers, including peers joining, departure and failure |
Precondition |
Dependent
on the current version of the Best-Peer project; the basic socket
communication framework in the current version can be used but should take
place “java.net.Socket” with “java.net.DatagramSocket”; The GUI is ready and
can be used directly; |
Requirements |
(1) The network bandwidth at each peer side should be used
efficiently and the implementation should be lightweight; (2) The interface used to invoke monitor operation should
be well defined; |
Specification |
(1) Peer departure: detect the “super peer
departure” event, and then update the necessary information on both Bootstrap
Server GUI and Super Peer GUI; detect the “client peer departure” event, and then
update the necessary information on the Super Peer GUI; detect the “bootstrap
server departure” event, and then broadcast this event to all super peers for
their departure behaviors. At the same time, for each super peer, it must force
its client peers to leave the network; (2) Peer failure: detect the “super peer
failure” event by polling PING message to each super peer periodically, and
then update the necessary information on both the GUIs of both bootstrap server
and super peer and invoke “stabilization” operation (interface is defined and
can be invoked directly) at the super peer side; detect the “client peer
failure” event by polling PING message to each client peer periodically, and
then update the necessary information on the super peer’s GUI; detect the
“bootstrap server failure” event by sending PING message to the bootstrap
server periodically, and then notify all super peers and their client peers
to leave the network; (3) Peer join: detect the “client peer
join” event, and then update the necessary information on the Super Peer GUI;
detect the “super peer join” event, and then update the necessary information
on both Bootstrap Server GUI and Super Peer GUI; |
Package Name |
sg.edu.nus.monitor |
Memo |
Similar to how to monitor peer’s online status in MSN and
Google Talk |
4.
Name |
Bootstrap Synchronization Module |
Goal |
Providing the facility for bootstrap servers to synchronize the online super peer list with one another, and therefore each client peer or super peer can join the network by using one of bootstrap server; this makes the system more robust and reduce burden at bootstrap side; |
Precondition |
Relatively
independent of the current version of the Best-Peer project; the basic socket
communication framework can be used |
Requirements |
The network
bandwidth at each bootstrap server side should be used efficiently and the
implementation should be lightweight; |
Specification |
(1) Information maintained by bootstrap server:
each bootstrap server must maintain two list: one for all online super peers
and the other for some online bootstrap servers (see bootstrap organization); (2) Bootstrap organization: All bootstrap
server in the system should be organized as a small tapestry system, and a
bootstrap server is assumed to be there forever as the root bootstrap; if the
root bootstrap server is down, then a new one should be selected out to act
as a new root bootstrap server; (3) Add new bootstrap server: a new
bootstrap server can join the system by accessing an existing one, and then copy
the super peer list from it; (4) Detecting bootstrap server status:
each bootstrap server needs to monitor all other bootstrap server’s online
status and updates the online bootstrap server list; (5) Communication between bootstrap servers:
the communication among all bootstrap servers are employed tapestry-liked
multicast (prefix-based multicast); |
Package Name |
sg.edu.nus.btsync |
5.
Name |
IR-based Schema Matching Module |
Goal |
Providing the facility to discover the matched relation database with the shared schema; returning the matched schema to users for further query processing |
Precondition |
(1) A
client user can share his local small database by exporting the schema to
other users; a schema management module at each client peer side is
responsible for maintaining all shared schema; and a schema management module
at each super peer side is responsible for maintaining all shared schemas by
all client peers belonging to it; (2) Each
peer has installed a relational database; MYSQL is recommended; |
Requirements |
(1) Constructing
the classification according to any existing methods, e.g. Bayesian
classifier or C 4.5; (2) Creating Schema mapping among the schemas in the same category; |
Specification |
(1) Forming Category: classify the sample schemas according to existing
methods (Bayesian classifier or C 4.5), and construct the hierarchical
classification, like Yahoo; (2) Category organization: The categories
in the hierarchical classification is maintained by one super peer,
containing its classified rules, its parent category, first child category,
next sibling category, and thus it can find any other categories; (3) Building schema mapping: Creating the
schema mapping among the schemas in the same category, that is, which
relation maps which other relation, and which attributes map which other
attributes. The local peer must maintain the mapping information, so that,
the peer can reformulate a query according to them, while super peer maintain
which schema is classified into certain category. (4) Add schema into category: for a new
schema, classified it into one of the existing category, or create a new
category if there is no category related to it. (5) Query reformulation: for a new query
from any peer, the posed peer reformulates it according to its schema mapping
information, and thus its relevant peer can understand and thus answer it. (6) Query processing: when a peer receives
a reformulate query, it reformulates the query according to its own schema,
answers the reformulated query, and finally returns the answer to the posed
peer. (7) Result visualization: the posed peer integrates all the answers from its relevant peers and visualizes the results; |
Package Name |
sg.edu.nus.sm |
6.
Name |
Relation Data Management Module |
Goal |
Providing the facility for client peers to share schema of local small database to other peers, including all interfaces to operate relation database when insert, delete, update data and perform IR-based retrieval |
Precondition |
(1) A
client user has installed a relation database, e.g., MySQL, at the client
peer side; (2) The
current version of the Best-Peer project; |
Requirements |
The interface used to export schema should be well defined; |
Specification |
(1) Displaying schema: access the backend
database system and visualize all fields and their necessary information
(e.g., data type) to users; (2) Selecting fields to be exported: users
can determine which schema and which fields to be shared by toggling
corresponding item, based on the visualization of the database; (3) Meta-data management: all meta-data of
the shared table and fields shared should be stored in a system-defined file
with system specified data structure; (4) Transferring exported schema to super peer:
upload all meta-data of the shared schema and fields to super peer; super
peer then indexes the meta-data with an inverted index whose structure is
<Field, Data Type, Description, Table, Peer ID>. Notice that the BATON
protocol will relay some meta-data to the corresponding super peers in terms
of “Field” value and this is implemented by BATON; (5) Processing SQL with exported schema:
given a simple SQL query, the client peer first executes the query locally
and then sends the query to super peer. After the super peer receives the
query, it will use the index with relation schema to determine which schema
may contain the desired data for the query. If having match, then the schema
information will be passed to the client peer for displaying. According to
the returned schema, the user determine which tables (fields) should be
queried and what kind of operation will be executed (e.g., join); Finally,
the query will be sent to the corresponding peers and the results will be
returned to visualize; (6) Data Management: If user determines to
catch the query results, all results should be cached to a temporary table
with the query schema. The content of the temporary table is not shared with
other peers; All shared relation data at local side cannot be updated or deleted
by remote users; |
Package Name |
sg.edu.nus.db |
7.
Name |
Index Management Module |
Goal |
Providing the facility for super peers to store and retrieval the index at the super peer side; the index is organized as the inverted index; the content to be indexed are coming from all shared data from all client peers that belong to the corresponding super peer |
Precondition |
(1) The
local files are indexed by Lucene; (2) The
current version of the Best-Peer project; |
Requirements |
(1) Synchronization is very important for this part. So the
code must be well design for this requirement with lock if necessary; (2) The interfaces used to manage index should be well
defined; |
Specification |
(1) Partitioning index range: giving an
option to user to specify how many index terms will be stored in a separated
index partition; in other words, to improve the system performance and reduce
the lock time on the index, we suggest to divide the whole index range into
several non-overlap segments (for example, [a-c], [c-h], [h-q] and [r-z]) and
each segment is stored in a separated file; (2) Insert index items: when a super peer
receives new index items from its client peer, it will open the corresponding
index partition, insert the index items and finally force the change to disk; (3) Update index items: when a client peer
updates its local index, the new index items will be uploaded to its super
peer. The super peer should open the corresponding index partition, update
old items by new ones; (4) Remove index items: when a client peer
deletes index items, the super peer’s index should be updated simultaneously; (5) Load index: when processing keywords
query, the corresponding index partition should be loaded from disk to
memory, if necessary. System should provide the interface for loading all
index partition and for loading a subset of index partition according keywords
set; |
Package Name |
sg.edu.nus.store |
Memo |
Refer to the source code of Lucene 1.9.1 (at site of
www.apache.org) |
Reference
1.
Sun
Microsystem. Code Conventions for the Java Programming Language. http://java.sun.com/docs/codeconv/,
1999.
2.
Sun
Microsystem. How to Write Doc Comments for the Javadoc Tool. http://java.sun.com/j2se/javadoc/writingdoccomments/,
2000.
3.
Erich
Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns:
Elements of Reusable Object-Oriented Software. Addison Wesley Inc., 1994.
4.
Diomidis
Spinellis. Code