Sunday, April 5, 2015

Is my FreeNAS server overpowered or underpowered or just about right?

I recently tested CIFS share performance on FreeNAS 9.3 (version FreeNAS-9.3-STABLE-201503270027, the latest as of April 5, 2015) on a PC built from the following parts:

Intel Core i3-2100T – dual core, 2.5 GHz clock speed
Asus P8H77-I motherboard with 6 SATA ports
1600 MHz DDR3 RAM
320 GB Hitachi 2.5" HDD (just for the test)
1 GbE port


The test topology is:


                    GbE                    GbE
FreeNAS Server ------------> Router ------------> Windows 7 Client PC
(Samba server)      Cat 6                  Cat 6  (LANTest)
(iPerf server)                                    (iPerf client)
(diskinfo -t /dev/ada0)


The results are described below:

(1) iPerf shows 76-85 MB/s with a CPU usage between 10-15% (the maximum is 200% since it is a dual-core CPU). I take this as the practical maximum that I can extract from my network with this setup. The theoretical maximum transfer rate is 125 MB/s, since the motherboard has only one GbE port.
(2) LANTest transfer performance is 74-76 MB/s write and 64-66 MB/s read with a CPU usage of around 25-28%.
(3) Measuring the transfer rate of a Windows copy with a large 5.3 GB file, I get a read rate of 57 MB/s and a write rate of 47 MB/s, with around 15-18% CPU when reading from the network server and 20-25% when writing to it.
(4) The diskinfo benchmark reports transfer performance between 28 MB/s and 62 MB/s, depending on where the data sits on the physical disk medium (outer/middle/inner areas).
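
As a quick sanity check on these figures, the 125 MB/s theoretical ceiling and the efficiency my iPerf run achieved can be reproduced with a little arithmetic (a rough sketch using the measured numbers above):

```python
# Rough sanity check of the throughput numbers above.
GBE_LINE_RATE_BITS = 1_000_000_000        # 1 Gigabit Ethernet line rate

# Theoretical maximum, ignoring Ethernet/IP/TCP framing overhead
theoretical_mb_s = GBE_LINE_RATE_BITS / 8 / 1_000_000   # 125.0 MB/s

# Best measured iPerf throughput as a fraction of the line rate
iperf_mb_s = 85
efficiency = iperf_mb_s / theoretical_mb_s              # 68% of wire speed

print(f"theoretical max: {theoretical_mb_s} MB/s, iPerf efficiency: {efficiency:.0%}")
```

So the LANTest and Windows-copy numbers are sitting at roughly 50-70% of the wire-speed ceiling, which is why the single GbE port, not the CPU, looks like the limiting factor.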



A few surprising things:
(a) Using tools like LANTest, CrystalDiskMark, NAS Performance Tester, etc., I am getting write performance higher than read performance, whereas a real-world Windows file copy-paste produces the more intuitive outcome (reads faster than writes). I have seen similar reports elsewhere on the Internet too.
(b) In my case, LANTest reports higher throughput than what is observed in a real-world copy-paste.

I did try some standard Samba optimizations (see the bottom of this post), set through auxiliary parameters in the FreeNAS CIFS GUI configuration, but none of them had any noticeable positive impact on throughput. If anything, there was a very marginal degradation (~1 MB/s).

On a little investigation, I found that the Samba server (the smbd daemon) is not multi-threaded but follows a multi-process design, with one process per network client. In my test topology there is only one client, so the smbd transfer uses only about 1/8th of the available CPU horsepower (200% max). The bottleneck looming on the horizon is the single GbE network interface.

I do not expect the results of AFP and NFS shares to drastically alter these observations regarding the bottleneck.


Suggestion for Home Users:
So for most consumer-grade applications, which may require at most 2-3 parallel Samba transfers in the worst case (and typically just 1), this FreeNAS setup of mine seems overpowered. Stepping down to a single-core CPU or a multi-core Atom is a workable option in the x86 world. It also opens a window of opportunity for multi-core ARM-based SBCs (especially those with SATA ports rather than only multiple USB ports), particularly where no RAID is required and one disk is sufficient.

It's also worth noting that many home users do not need very high transfer rates (100 MB/s or so). They are fine if a copy (read/write) just works at 10-20 MB/s, which is sufficient for downloads and 1080p video streaming (though 4K would be a challenge), and they do not move big files around or use a network drive as a replacement for local storage. I am one of them most (maybe almost all) of the time. That said, Fast Ethernet (10/100 Mbps) should be avoided.


Suggestions for SOHO or small enterprise use:
For office deployment you need more parallel transfers, so it is worthwhile giving this box at least a dual GbE PCI NIC, or better a quad GbE NIC, possibly used with link aggregation (which might require a better router/switch that supports this feature). You could also select a motherboard with 2 GbE ports to start things out.
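
To get a feel for the sizing, here is a back-of-the-envelope sketch. The 70% practical-efficiency factor and 50 MB/s per-client figure are assumptions loosely derived from my measurements above, not guarantees:

```python
# Hypothetical sizing of parallel SMB clients per NIC configuration.
GBE_MB_S = 125            # theoretical per-port maximum
PRACTICAL_FACTOR = 0.7    # assume ~70% of line rate is achievable in practice
PER_CLIENT_MB_S = 50      # assumed per-client rate (Windows copy ballpark)

results = {}
for ports in (1, 2, 4):
    aggregate = ports * GBE_MB_S * PRACTICAL_FACTOR
    clients = int(aggregate // PER_CLIENT_MB_S)
    results[ports] = (aggregate, clients)
    print(f"{ports} GbE port(s): ~{aggregate:.0f} MB/s aggregate, "
          f"~{clients} full-speed clients")
```

In other words, under these assumptions a single port saturates with just one busy client, while a quad-port bond could serve a handful of concurrent full-speed transfers.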

So while building your own FreeNAS file server, it is worth doing a little research on the Internet regarding the speeds achieved with different CPUs, and sizing your hardware accordingly.


Reference: Samba (CIFS) tuning options that I tried, but which didn't work any wonders for me:
aio read size = 16384
aio write size = 16384
read raw = yes
write raw = yes
socket options = TCP_NODELAY IPTOS_LOWDELAY SO_KEEPALIVE SO_RCVBUF=131072 SO_SNDBUF=131072 IPTOS_THROUGHPUT
use sendfile = true

Sunday, March 15, 2015

Not all MicroSD cards are equal

I have a bunch of credit-card-size single board computers to play around with, like the Raspberry Pi, BeagleBone Black, Odroid-C1 and Banana Pro. One item that I have to procure with every SBC is the flash storage for the OS and applications, i.e. the MicroSDHC card. It is critical to boot-up time and run-time operation, as Linux depends heavily on disk access.

But I just discovered that not all cards are made equal. Two reputed brands available in India are SanDisk (SanDisk Ultra Class 10 UHS-1) and Kingston (8 GB Class 10 UHS-1). I compared the performance of the SD cards on the BeagleBone Black and on a PC (with a BitFenix USB 3.0 internal card reader, Core i7-2600K, 8 GB RAM), and here are my observations:


  1. The SanDisk card gave a sustained performance of about 19.2 MB/s for sequential reads on the BeagleBone Black and 22 MB/s on the PC
  2. The Kingston card gave a sustained performance of about 12.5 MB/s for sequential reads on the BeagleBone Black
  3. A Samsung 840 EVO SSD can give a sequential read performance of 66 MB/s
  4. A Western Digital RE black enterprise drive can give a sequential read performance of about 95 MB/s


The test was done using a simple command:

sudo hdparm -t [flash-device-Name] 

So my first conclusion is that not all SD cards are equal. Though the Kingston and SanDisk cards are both rated Class 10 and UHS-1, the SanDisk one is about 50% faster than the Kingston. Many buyers on Amazon have also complained about the relatively slow real-life performance of Kingston cards in their smartphones. So I would recommend that SBC buyers faced with a choice between these two cards opt for the SanDisk make.

Second, expect a performance drop of about 15% when moving an SD card from a PC to an SBC. This is not very significant.

And third, if you are moving an I/O-intensive (I/O-bound) application from a Linux PC to a Linux SBC, expect a 3-5x drop in I/O performance (assuming you are using the better SanDisk cards). Of course, the BeagleBone Black and Raspberry Pi can generate data at no more than a theoretical 12.5 MB/s on the Ethernet port, but they can generate data faster internally. The Odroid-C1 and Banana Pro, however, have GbE interfaces that can theoretically receive 125 MB/s, and for them a slow card will only restrict their capability.
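
The three conclusions above follow directly from the hdparm figures; here is a quick check of the arithmetic:

```python
# Arithmetic behind the three conclusions, using the measured figures above.
sandisk_bbb, sandisk_pc = 19.2, 22.0   # MB/s on BeagleBone Black vs PC
kingston_bbb = 12.5                    # MB/s on BeagleBone Black
ssd, hdd = 66.0, 95.0                  # SSD and enterprise HDD sequential reads

speedup = sandisk_bbb / kingston_bbb - 1         # ~0.54 -> "about 50% faster"
pc_to_sbc_drop = 1 - sandisk_bbb / sandisk_pc    # ~0.13 -> "about 15% drop"
io_gap = (ssd / sandisk_bbb, hdd / sandisk_bbb)  # ~3.4x to ~4.9x -> "3-5x"

print(f"{speedup:.0%} faster, {pc_to_sbc_drop:.0%} drop, "
      f"{io_gap[0]:.1f}x-{io_gap[1]:.1f}x I/O gap")
```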

Also, it's worth noting that when I connected an old 2.5" Hitachi drive (model 5K320-160) to a Banana Pro SBC using the board's SATA interface, the hdparm test gave a result of 52 MB/s. A USB 2.0 connected drive may give around 30 MB/s, while an SSD connected over SATA may give 90-100 MB/s. So my final conclusion about storage medium speeds on SBCs is:

SATA SSD > SATA HDD > USB SSD/HDD > Flash Storage

Many boards do not feature SATA, and therefore flash and USB SSD/HDD are the only available options. Most will not even boot from a USB drive easily.

Tuesday, February 17, 2015

Early signs of changing winds in the Personal Computing Industry

It's 2015, and I see winds of change for the desktop computing platform. The mobile (ARM-based) platform continues its stranglehold on the mobile computing space, while Intel and AMD continue to operate in a relatively declining desktop consumer market. But what is not immediately obvious is the quiet entry of ARM-based devices as entry-level PC hopefuls, just as Intel and others are trying to shrink the PC with their Mini PC and NUC efforts.

Today smartphones are coming out with 2 GHz+ quad-core or 8-core ARM devices, coupled with 4 GB RAM. While by no stretch of imagination can these beat Intel or AMD CPUs in raw performance, the fact of the matter is that many users DO NOT need all that x86 compute power, at least NOT ALL the time. You can see this in the prevalence of thin clients in office environments (large and small). Even at the consumer end, if you measure the average CPU usage of your PC over time, it might be less than 10%, with a typical peak between 30-50%. So let's do a rough comparison of an x86 machine running an entry-level Core i3 and a Raspberry Pi 2 single board computer running an ARMv7 chip with 1 GB of slow RAM and flash storage.

  1. CAPEX - The Raspberry Pi machine (minus LED screen, keyboard and mouse, but loaded with WiFi, case, PSU and a fast MicroSDHC card) will be around $60. The PC (cabinet + PSU + motherboard + CPU) will be around $350 without peripherals. That's a 6x difference, and you get only 2 cores in the Intel PC setup. For the sake of simplification, I assume we run a Linux distro (like Ubuntu) on both.
  2. OPEX - The PC's OPEX is power. A Raspberry Pi draws about 3-4 W on average (10 W is the theoretical maximum based on PSU input power at 100% efficiency), while the PC with a bare-bones 250 W SMPS will draw around 120 W at idle. That's a differential of 30-40x. Even if the Raspberry Pi is kept on 24x7x365, it will consume about 35 kWh of energy per year (or roughly Rs. 200 per year), with the PC coming in at 30-40 times that number.
  3. Software - A Raspberry Pi 2, with its 900 MHz quad-core CPU, 1 GB RAM, 32 GB flash and WiFi/Ethernet, will most likely be able to handle all content consumption tasks [browsing, email, social networking, chat, audio/video streaming (including 1080p) and playback]. It will also be able to do basic content creation tasks (image editing, blogging, word processing, etc.) decently. The only drawback of the Raspberry Pi system would be heavyweight content creation (bulk image processing, video editing, heavy games, 3D graphics and visualization, and so on), which most consumers anyway do not engage in on a routine basis, if at all. Today the Linux ecosystem has developed to an extent where it can compete well with Windows applications for the basic work listed above; in other words, it can satisfy the most common needs. Even so, desktop application load times are a little too much for comfort, though once loaded, many applications work fine as long as they stay in memory.
  4. Storage & Distribution Trends - Flash storage is now generally preferred over magnetic disk storage because of its performance benefits. It includes flash cards, flash drives and SSDs, which generally consume less power and occupy a smaller footprint than 3.5" HDDs. Optical media was earlier used to distribute content, but on-demand downloads, (multimedia) streaming and USB pen drives have replaced it in the age of net-enabled devices, rendering it almost obsolete. Both HDDs and optical drives are now archival storage media rather than working storage, i.e. add-on peripherals to be connected on demand, and therefore outside the core PC.
  5. Graphics capability - While there is a place for discrete graphics (GPUs), integrated graphics on the main motherboard has evolved to a point where it can handle common user activity with ease. Only for more intense graphics, 3D, gaming and massively parallel applications does discrete graphics provide any perceivable benefit, and most users do not indulge in these anyway.
  6. HDMI - The integration of sound with display by means of an HDMI port has eliminated the need for integrated or discrete sound cards, more so in an age when speakers are getting integrated into display monitors. A similar trend is the integration of webcams into monitors connected to the PC over USB. All this again means a reduction in ports and motherboard circuitry.

All of the above combine to create a disruptive shockwave in the PC space. We must change our thinking about what a PC is. And with increasing volumes, core counts and performance of ARM-based SoCs in mobile devices, cost is going down and performance is improving at any given price point.

The PC (and with it Intel and Microsoft) is thus being presented with a big challenge in the form of the Raspberry Pis and Odroid-C1s, along with the Linux operating system. Increasing mobile device sales are driving down cost and increasing capabilities in the ARM ecosystem (the Pi 2 is 6 times as powerful as the previous generation!). It is easy to brush this off by saying that a PC necessarily needs a Windows OS, graphics cards, sound cards, etc., but for an entry-level system this assertion may not be universally true. This opens a market for ARM-based PCs.

I would not be surprised to see a 2-3 GHz Raspberry Pi with 8 cores and a powerful GPU in 3-5 years' time, which could make the PC redundant the way mainframes were done in by PCs in the last century. Most applications are anyway being redesigned for multi-core rather than clock-speed scale-up. Currently, speed is a gap everywhere: the RAM is slower in SBCs, and most boot and operate from SD cards, which lag way behind HDDs and SSDs in performance. (Even on Linux, if not the boot-up time, the load time of heavy applications like a browser, LibreOffice or GIMP is just too long; but once they load and work in memory, they are reasonably usable.) Time may fix this sooner than we expect. And all this at $35 for the board and almost free for the software ;-)

It is also possible that mobiles and tablets will morph into the PC chassis with strong content creation and consumption capabilities. In either case, it means turbulent times for the x86/x64 platform (and along with it, Intel and Microsoft). The Intel x86 and Windows PC is getting pushed into a niche by ARM and Linux, just as the Wintel combine pushed UNIX, and UNIX pushed mainframes, during their dawn.

Prove me wrong Wintel !!!