Individual Report

DISTRIBUTED SYSTEM

An Assignment for the Operating System module, in the University of Central England

By: Harry Sufehmi, student-id #98799231

Introduction
------------

A distributed system is defined by three characteristics [Mullender, 2]:

1. It consists of multiple computers
2. The computers are interconnected
3. They have a shared state

Distributed.net (http://distributed.net) was founded by Adam L. Beberg, a graduate of the Illinois Institute of Technology with a focus on operating systems. Distributed.net is a non-profit academic organisation committed to serving as a gathering point for topics relating to distributed computing, or the process by which countless computers work together toward solving a particular problem. It is through the application of this concept that distributed.net has been able to develop and refine these techniques, improving on the range, scope, and variety of tasks which are suitable for this technology.

Most distributed systems are not very big. NFS, for example, sacrifices scalability in return for more transparent access. Because the server and client are so closely linked, almost every action requires communication between them, despite the caching schemes that are implemented.

Distributed.net, on the other hand, seems to follow these principles closely [Mullender, 375]:

1. Clients have the cycles to burn: Most of the work is done by the clients. This greatly enhances the scalability of the system, since adding clients does not place a heavy burden on the server's resources.

2. Minimise system-wide knowledge and change: Distributed.net doesn't make all the clients aware of the details of the system status - only the most important ones. For example, when the job is switched from RC5-64 cracking to DES-II cracking, the client recognises the small signal embedded in the packet stream when it connects, and automatically switches jobs.

3. Trust the fewest possible entities: The clients don't bother with other clients; they're only concerned with the keyserver they're connected to. Therefore an Internet connection (an unsecured link) is sufficient, since the data is simply encrypted at both ends.

4. Cache whenever possible: The amount of cache is configurable. For a computer that is connected to the Internet 24 hours a day, 7 days a week, or for a slow computer, a small cache is suggested, so that when the key is cracked the client can quickly notify the keyserver. In contrast, a computer with a part-time or unreliable connection, or a very fast computer, is advised to build up a large cache. That way, losing contact with the keyserver for long periods won't interrupt the client's work. For a fast computer, a large cache also keeps traffic low, reducing the possibility of overwhelming the keyserver.

5. Batch whenever possible: The only connections made between the keyserver and the client are when the client downloads new blocks to be cracked, or when it submits the blocks that have been processed. It doesn't require constant communication. Moreover, a proxy server can sometimes be inserted between the clients and the keyserver. The proxy server serves many clients, and only when it reaches its threshold does it download/upload blocks from/to the keyserver, further reducing the number of communications to/from the keyserver.

The result of following these principles is a system that has reached a new level of scalability. Volunteer participation in distributed.net is estimated at over 20,000 individuals from nearly every nation and region. With combined resources of as many as 100,000 computers, distributed.net easily represents the largest academic collaborative computing effort ever undertaken.

Discussion
----------

Tanenbaum acknowledges in his book that it's quite difficult to pin down the definition of "operating system" [Tanenbaum, 3].
He accepts several definitions, including:

1. Extended / virtual machine
2. Resource manager

By these definitions, it's easy to see that distributed.net's software falls under the category of operating system, rather than just an application, as described below:

1. Virtual machine: The distributed.net project connects many computers around the world, and sums up all of their computing power by making them all process a similar task. This way, they can be thought of as a single huge machine, although in reality they're scattered all around the earth.

2. Resource manager: The distributed.net software also has the capability to manage (albeit in a somewhat limited way) and utilise the resources of the clients, including memory, disk space, network, and especially the processor.

In short, the definition of "operating system" has now started to truly mean "to enable the operation of a system" instead of just "to enable the operation of a type of computer hardware".

Due to space constraints, I'll only give a quick summary of the following.

The distributed.net software is available for almost all major operating systems, for example Windows 3.1, Windows 95, Windows NT, Linux, Sun Solaris, AIX, BeOS, IBM OS/390, Rhapsody, VMS, Digital Unix, Amiga, OS/2, etc.

The client is designed to be small and efficient, and can be completely hidden from the user's view. In its default operating mode, "very nice", the client consumes 100% of the CPU cycles when the computer is not in use, but instantly releases the CPU at the first sign of user activity. So the user would not notice any difference at all.

To deter viruses, they use rc5.distributed.net as the only download site. Also, all of the programmers are trusted, and they all work from a common code base.

For security, the packets are sent in encrypted form, using a simple algorithm. In the future, the v3 client (the v2 client is currently in use) will use an enhanced encryption key.
They're so confident in it that they even promise to release the source code. This is a good thing, especially for the academic community.

The current project of distributed.net is to crack RSA's RC5 64-bit crypto key. This is a massive task: at most, 1.845 x 10^19 (2^64) keys will have to be checked. Past projects completed successfully are the cracking of RSA's 56-bit encryption (at most 72 quadrillion keys to be checked) and of DES II-1 encryption (in only 40 days).

All of this also signals to governments that current standard encryption (for example, DES is the US government's standard) is no longer enough to secure data: even loosely organised volunteers can crack it - let alone a determined foe with massive resources.

Future projects include, but are not limited to, finding the optimal Golomb ruler, finding the largest prime number, etc.

Conclusions
-----------

It's breathtaking to imagine bonding together this many computers, with this many operating systems, with so many hardware platforms, from so many nations, in so many time zones, and putting them all to work on a common computing task. Yet it can be done, has been done, and is still expanding...

As a crude comparison, on 24 February 1998 the total computing power of distributed.net's volunteers was equivalent to 15,316 Sun Ultra 1 workstations. Of course, this figure is no longer correct, since the member base is growing very fast, day by day.

I think the ultimate goal is a transparent distributed system. Imagine a user in a laboratory who is about to run a task that would normally put a very heavy load on the processor - such as a complex simulation. All that he needs to do is run the distributed server software, and mark the simulation program's pid with a special flag.
The distributed server program would notice this, and automatically spread the task out to other workstations in the lab running the distributed client software, probably on a different OS but still on the same hardware platform. The client software would then execute the task in a separate virtual space, using whatever idle CPU time is available.

The conditions that must be met to realise this are:

1. The software must be coded to utilise a multiprocessor system according to the OS's standard, or
2. Its most processor-intensive routines must be spread over several threads that can be recognised by the distributed server software.
3. The distributed server software must have hooks into the OS kernel. Only then can the distributed server software grab all of the threads, and run them on the network instead.

Requirements 1 and 2 would probably be considered inconvenient, but they are unavoidable. This is because code execution at the processor level is sequential, so there's no way to spread it out automatically, even if we try to do so in the kernel itself. It can only be done when the task division is done at the application software level.

This way, there's no need to buy many expensive servers for the task and set them up in a clustering configuration using some high-end Unix OS. All we need is some clone PCs connected via a network; the distributed server software would then automatically spread the tasks set by the user(s) out to the PCs, effectively utilising all the idle processor time that would otherwise be wasted.

(update: MOSIX is a project towards this goal, and I didn't even know of their existence when first writing this report! Come visit their website at http://www.mosix.cs.huji.ac.il)

For the time being, though, distributed.net's RC5-64 project is already a breakthrough in academic research on distributed computing.
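
To make the point about application-level task division concrete, here is a minimal sketch in Python. It is not distributed.net's code: the function names (split_keyspace, search_block, distributed_search), the toy 16-bit keyspace, and the block size are all assumptions made for illustration, and a local thread pool merely stands in for client machines on a network. The only point it tries to show is that the application itself cuts the work into independent blocks, after which any scheduler can hand them out.

```python
from concurrent.futures import ThreadPoolExecutor


def split_keyspace(total_keys, block_size):
    """Divide the range [0, total_keys) into independent (start, end) blocks."""
    return [(start, min(start + block_size, total_keys))
            for start in range(0, total_keys, block_size)]


def search_block(block, is_match):
    """Check every key in one block; return the matching key, or None."""
    start, end = block
    for key in range(start, end):
        if is_match(key):
            return key
    return None


def distributed_search(total_keys, block_size, is_match, workers=4):
    """Hand the blocks to a pool of workers and return the first hit found.

    The thread pool is only a local stand-in for networked client machines.
    """
    blocks = split_keyspace(total_keys, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each block is processed independently of all the others.
        for result in pool.map(lambda b: search_block(b, is_match), blocks):
            if result is not None:
                return result
    return None


if __name__ == "__main__":
    secret = 48611  # hypothetical key hidden in a toy 16-bit keyspace
    print(distributed_search(2 ** 16, 4096, lambda k: k == secret))
```

Because no block depends on any other, a worker that receives a batch of blocks can process them without further communication, which is the property the requirements above rely on.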
I strongly advise the University of Central England to support this cause, especially to fully exploit the idle processing time of the Sun workstations in the labs. I shall be glad to assist in any way I can.

Reference List

1. Mullender, Sape (Ed.) (1993), Distributed Systems, New York, Addison-Wesley.
2. Hubbard, Charles (1998), RC5-64: Project Bovine FAQ, http://www.distributed.net/FAQ/rc564faq.htm, Distributed Computing Technologies, Inc.
3. Tanenbaum, Andrew S. (1987), Operating Systems - Design and Implementation, New Jersey, Prentice-Hall.
4. McNett, David (1998), Press Information, http://www.distributed.net/pressroom/presskit.htm, Distributed Computing Technologies, Inc.
5. Materials from an interview with Adam L. Beberg ([email protected]), founder of Distributed.Net. Used with his written permission, including for academic purposes and public viewing on the Internet.

_____________________________________________________________
Last modified: 27 February 1999