Getting Sum, Count and Distinct Count of a file

Hi all this is a UNIX question.
I have a large flat file with millions of records.
col1|col2|col3
1|a|b
2|c|d
3|e|f
3|g|h
footer****
I am supposed to calculate the sum of col1 = 9, the count of col1 = 4, and the distinct count of col1 = 3.
I would prefer to avoid external commands like awk. Also, can we do the same by creating a function?
Please bear in mind that the file is huge.
Thanks in advance.

This sounds like homework for a shell class, but here goes. Save into a file, maybe "doit". Run it like this:
$ ./doit < data
<snip>
#!/bin/sh
got0=0
got1=0
got2=0
got3=0
got4=0
got5=0
got6=0
got7=0
got8=0
got9=0
sum=0
cnt=0
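# Split each line on '|'; the header and footer fail the [0-9] test and are skipped
while IFS='|' read -r c1 c2 c3 junk; do
# Sum and count, using shell arithmetic instead of the external expr command
case "${c1}" in
[0-9] )
     sum=$((sum + c1))
     cnt=$((cnt + 1));;
esac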
# Distinct
case "${c1}" in
0 )     got0=1;;
1 )     got1=1;;
2 )     got2=1;;
3 )     got3=1;;
4 )     got4=1;;
5 )     got5=1;;
6 )     got6=1;;
7 )     got7=1;;
8 )     got8=1;;
9 )     got9=1;;
esac
done
echo "cnt=${cnt}"
echo "sum=${sum}"
echo "distinct=$((got0 + got1 + got2 + got3 + got4 + got5 + got6 + got7 + got8 + got9))"
<snip>
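
The script above only recognises single-digit values in col1. Since the question also asks about a function, here is a rough sketch of one way it might be wrapped up, still using only shell builtins. The name col_stats and the "|v1|v2|" string used to remember seen values are my own assumptions, not something from the original post. It assumes col1 holds non-negative integers without leading zeros, and the string lookup gets slow if there are very many distinct values.

```sh
# col_stats: read '|'-delimited lines on stdin and print the count, sum and
# distinct count of the first column. Hypothetical helper, builtins only.
col_stats() {
    sum=0 cnt=0 distinct=0
    seen='|'                 # values seen so far, stored as |v1|v2|...|
    while IFS='|' read -r c1 rest; do
        case "$c1" in
            # Skip header, footer, empty fields and numbers with leading zeros
            ''|*[!0-9]*|0[0-9]*) continue ;;
        esac
        sum=$((sum + c1))
        cnt=$((cnt + 1))
        case "$seen" in
            *"|$c1|"*) ;;    # already counted as distinct
            *) seen="$seen$c1|"; distinct=$((distinct + 1)) ;;
        esac
    done
    echo "cnt=$cnt"
    echo "sum=$sum"
    echo "distinct=$distinct"
}
```

After defining the function in a script or sourcing it, it would be run as: col_stats < data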

Similar Messages

  • Can I get a count of all files in a folder, including in subfolders?

    I know the various ways to get a count of items in a folder. This gives me the number of files and subfolders. Is there a way to include the number of files in the subfolders in the total folder count?

    I also found that, here: http://superuser.com/questions/198817/recursively-count-all-the-files-in-a-directory
    That will give you the count of just files, excluding the folders.
    If you want to count the non-hidden files, you can do this:
    find path/ -type f ! -name ".*" -flags nohidden | wc -l
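
    If this is needed often, the same idea could be wrapped in two small shell functions. This is only a sketch: the names count_files and count_visible_files are made up for illustration, it assumes no file name contains a newline, and it only filters on a leading dot (it does not use the macOS-specific -flags test).

```sh
# count_files DIR: number of regular files under DIR, subfolders included.
count_files() {
    # $(( ... )) strips the padding some wc implementations add
    echo $(( $(find "$1" -type f | wc -l) ))
}

# count_visible_files DIR: same, but ignoring files whose name starts with a dot.
count_visible_files() {
    echo $(( $(find "$1" -type f ! -name '.*' | wc -l) ))
}
```

    Usage would be something like: count_files path/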

  • Efficient way to get FCE4 Log and Transfer to read .mts files stored on drive?

    Hi All
    I've searched the FCE discussion forum and not found an answer verified by more than one user to this question: What is an efficient way to get FCE4 (via the Log and Transfer window) to see .mts files from an AVCHD camera stored on a drive (NOT via the camera -- directly from the drive)?
    I am trying to plan the most space-efficient system possible for storing un-transcoded .mts files from a Panasonic AG-HMC151 on a harddrive so that I can easily ingest them into FCE4. I am shooting a long project and I want to be able to look at .mts files so that I can decide which ones to transcode to AIC for the edit.
    Since FCE4 cannot see .mts files unless they have their metadata wrapper the question is really 'how do I most efficiently transfer .mts files from the camera to a storage harddrive with their metadata wrappers so that FCE4 can see them via the log and transfer window?'
    Nick Holmes, in a reply in this thread
    http://discussions.apple.com/thread.jspa?messageID=10423384&#10423384
    gives 2 options: Use the Disk Utility to make a disk image of the whole SD card, or copy the whole contents of the card to a folder. He says he prefers the first option because it makes sure everything on the card is copied.
    a) Have other FCE users done this successfully and been able to read the .mts files via Log and Transfer?
    In a response to this thread:
    http://discussions.apple.com/thread.jspa?messageID=10257620&#10257620
    wallybarthman gives a method for getting Log and Transfer to see .mts files that have been stored on a harddrive without their metadata wrappers by using Toast 9 or 10.
    b) Have any other FCE4 users used this method? Does it work well?
    c) Why is FCE4 unable to see .mts files without their metadata wrappers in the Log and Transfer window? Is it just a matter of writing a few lines of code?
    d) Is there an archiving / library app. on the market that would allow one to file / name / tag many .mts clips and view them prior to transcoding into space-hungry AIC files in FCE?
    Any/all help would be most gratefully received!

    I have saved the complete file structure on DVD as a backup, but have not needed to open them yet. But I will add this: as I understand the options with Toast, you are in fact converting the video to AIC or something like it. I haven't looked into it myself, and I can't imagine the extra files are that large, but maybe they are significant; I don't know. The transcoded files are huge in comparison to the AVCHD file.
    A new player on the scene for AVCHD is Clipwrap 2.0. As I understand this product, it rewraps the AVCHD into a wrapper that QuickTime can open and play. This works with the MTS files only; the rest of the file structure is not needed. The rewrap is much faster than the transcode to AIC, so you have the added benefit of being able to play the files as well as not storing the extra files. The 2.0 version (which is for AVCHD) was just recently released. I haven't tried it and don't personally know of anyone who has. You might want to try this; there is a trial version, as I recall.

  • Getting a G4 and MacBook Pro to transfer files via ethernet

    Thank you for the help on setting up file sharing. File sharing looks to be working on both computers (now) but it prompts a new question: I now have a problem with the ethernet connection. I unplugged my cable modem ethernet cable and connected it to the ethernet port on the back of the G4 and to the MacBook Pro ethernet slot (as recommended in Mac Help). I opened File Sharing in both computers. I've followed the steps in Mac Help to get the MacBook Pro to recognize the G4 so I can start transferring files to the MBP. No success getting them to "see" each other.
    It appears that the MacBook Pro wants me to do it over the Internet but Mac Help is telling me I can do it directly from computer to computer since I've got them connected via the ethernet cable. (Step 3. is "connect to server" and, once connected, there is a "browse" option). My hope is to use the G4 as the "server" but I can't determine how to make the MBP connect to the G4 so I can transfer files. (each computer has a different name.)
    Any suggestions are appreciated.
    Thank you.
    PowerMac G4 Mac OS 9.2.x MacBook Pro 17" laptop 2.16 GHz, 2 GB 667 MHz DDR2 SDRAM

    Thanks for the suggestion. In following your recommendation I get a window on my MBP that says, "Connection failed. The server may not exist or it is not operational at this time. Check the server name or IP address."
    In trying to "connect," I've inserted the ethernet cable into the port on the MBP and the G4. So, I'm not connected to the cable modem. Since there is only one ethernet port on each computer, that seemed sensible. Maybe not.
    On the MBP I also tried the option, in the Sharing folder under "Internet," to share my built-in MBP ethernet connection with the built-in G4 ethernet. When I tried that a message popped up saying, "if you turn on this port, your ISP might terminate your service to prevent you from disrupting its network. In some cases (if you use a cable modem, which I do) you might unintentionally affect the network settings of your ISP and violate the terms of your service agreement."
    On the G4 I've enabled file sharing and enabled file sharing clients to connect over TCP/IP.
    I don't know if this is relevant, but my TCP/IP on the G4 is configured to connect via Ethernet and configure using DHCP server.
    Thank you.
    The most consistent way I have found is to use the IP address. The Browse function should work, but sometimes it does not.
    If you want to connect from the MBP to the G4 (G4's volume shows up on MBP desktop), you need to know the IP address of the G4. Go to the TCP/IP control panel (on the G4). You should see the G4's IP address there. Note it down.
    On the MBP, go to Menu -> Go -> Connect to Server... as you have already done. Instead of hitting Browse, just type in that IP address where it says Server Address. Now hit the Connect button.
    You should be prompted for the user name and password for the G4.
    PowerMac G4 Mac OS 9.2.x MacBook Pro 17" laptop 2.16 GHz, 2 GB 667 MHz DDR2 SDRAM

  • Broken Korean and Japanese fonts on music files.

    I get broken Korean and Japanese fonts on music files.
    The names show up broken both on the file and in iTunes when I play it.
    How can I have this fixed?
    I have 3 other laptops that are not Macs, and on those laptops it shows up just fine.
    I tried adding a language pack or something, and it still doesn't work.
    It's been like this since I purchased it.

  • Get Column sum, count in sql developer

    Dear all seniors,
    I need help with Oracle SQL Developer. After a SELECT statement returns its result as rows and columns,
    how can I get the sum, count, etc. of a column?
    In PL/SQL Developer we simply right-click a column and then click the desired function.
    Does SQL Developer have this facility as well?

    We've kicked around the idea of doing this for awhile. I believe there's already an item in the Exchange if you want to up vote it. The challenge being if it's a result set which hasn't been fully fetched, doing an aggregate would require one - and that could be costly. Of course folks live in the grids so I see the value in this type of feature as well.
    As a workaround, you could of course export your data to CSV or XLS and do the calculations in your favorite spreadsheet software.

  • To get the count of records and able to access the column value in a single

    Hi
    Is there any way to get the number of records in the query and access the column values
    e.g
    select count(*)
    from
    (SELECT department, COUNT(*) as "Number of employees"
    FROM employees
    WHERE salary > 25000
    GROUP BY department ) a
    This will only get the count. If I want to access each row from the inline view, how can I do that?

    Your question is not clear.
    Are you looking for total record count as well as count by department ?
    Something like this?
    with temp as
    (
    select 1 dept, 10000 sal from dual union
    select 1 dept, 25100 sal from dual union
    select 1 dept, 30000 sal from dual union
    select 1 dept, 40000 sal from dual union
    select 2 dept, 10000 sal from dual union
    select 2 dept, 25100 sal from dual union
    select 2 dept, 30000 sal from dual union
    select 2 dept, 40000 sal from dual )
    select count(*) over (partition by 1) total_count, dept,
    count(*) over (partition by dept) dept_cnt from temp
    where sal > 25000;

    TOTAL_COUNT       DEPT   DEPT_CNT
              6          1          3
              6          1          3
              6          1          3
              6          2          3
              6          2          3
              6          2          3
    6 rows selected

  • How to get the count of items(folders or documents) in a SharePoint document library and for a folder in any SharePoint document library?

    I need to get a count of documents and folders in a selected document library (the item for EventReceivers for event deleting) and the count of documents in a selected folder (the item for EventReceivers for event deleting) to determine whether the item (document library or folder) is empty or not, as I need to use this condition for one of my custom EventReceivers.
    Can you please suggest the code to do this?

    Hi.
    Try this:
    using System.Collections.Generic;
    using Microsoft.SharePoint;
    class Program
    {
        static void Main(string[] args)
        {
            using (SPSite site = new SPSite("http://z2012net"))
            using (SPWeb web = site.OpenWeb())
            {
                SPList l = web.Lists["List001"];
                List<FolderInfo> res = new List<FolderInfo>();
                getListCount(l.RootFolder, ref res);
            }
        }
        // Records the item count of this folder, then recurses into each subfolder
        private static void getListCount(SPFolder sPFolder, ref List<FolderInfo> res)
        {
            SPQuery query = new SPQuery();
            query.Folder = sPFolder;
            res.Add(new FolderInfo() { ItemCount = sPFolder.ItemCount, Name = sPFolder.Name, Url = sPFolder.Url });
            SPListItemCollection items = sPFolder.ParentWeb.Lists[sPFolder.ParentListId].GetItems(query);
            foreach (SPListItem item in items)
            {
                if (item.FileSystemObjectType == SPFileSystemObjectType.Folder)
                {
                    SPFolder f = sPFolder.ParentWeb.GetFolder(item.UniqueId);
                    getListCount(f, ref res);
                }
            }
        }
    }
    class FolderInfo
    {
        public string Url { get; set; }
        public string Name { get; set; }
        public int ItemCount { get; set; }
    }
    Regards,
    Bubu
    http://zsvipullo.blogspot.it
    Please mark my answer if it helped you, I would greatly appreciate it.

  • How to get the table_name and its count(*) in a SQL

    Hi,
    Can anybody tell me how to write a sql to get the table_name and its count(*) in a SQL:
    Output should be:
    table_name count(*)
    XXX 261723
    YYY 3343
    Regards,
    G. Rajakumar.

    Hello,
    There are a lot of ways.
    I'll suggest two of them:
    1) The following dynamic SQL procedure
    DECLARE
    TYPE array_type IS TABLE OF VARCHAR(30);
    TYPE cur_typ IS REF CURSOR;
    c1 cur_typ;
    count1 integer;
    tab_arr array_type;
    querystr varchar2(200);
    begin
    SELECT table_name bulk collect into tab_arr FROM sys.all_all_tables ;
    FOR I IN tab_arr.first..tab_arr.last LOOP
    DBMS_OUTPUT.PUT(TAB_ARR(I) || ' ');
    querystr := 'select count(*) from ' ||TAB_ARR(I);
    open c1 for querystr;
    fetch c1 into count1;
    EXIT WHEN c1%NOTFOUND;
    dbms_output.put_line(count1);
    END LOOP;
    close c1;
    END;
    2) Or use ANALYZE to analyze the tables and get the number of rows from the NUM_ROWS column of the DBA_TABLES view.
    If you still have any problem, mail me at [email protected]
    shalini

  • How to get the count of distinct customer in matrix?

    Hi,
    I want to get the count of distinct customer in matrix at the time of validation event.
    Thanks Regards,

    Hi,
    Please close the thread by marking the correct answer.
    regards,
    Prasad

  • How do you get a word and character count of a document in Pages for Mavericks

    How do you get a word and character count of a document in Pages for Mavericks?

    Hi jonathan,
    I struggled with this one as well, as I'm finishing up a PhD thesis and word counts are incredibly important. Through trial and error I found out that words like 'e.g.' and 'i.e.' count as two words and an ampersand (&) doesn't count at all. That being said, I didn't like the fact that the word count always included footnotes, and I was dismayed that I couldn't get an accurate count of words in the main body of my text. That all disappeared yesterday when, by chance, while I was copying a completed chapter and pasting it into my main document, I discovered that Pages can indeed give you an accurate word count excluding the footnotes! Here's all you need to do:
    1. In Pages 5.2, make sure that the Word Count function is first enabled by selecting View --> Show Word Count from the Pages Menu Bar. (If it's already enabled, it will read View --> Hide Word Count, so if that's what it says, then no need to do anything.)
    2. Once enabled, check the word count that's currently showing at the bottom of the page. That's the word count including your footnotes.
    3. Now, place your cursor anywhere within the current document, then hit command+A (for Select All).
    4. Voilà! Your word count now shows the actual number of words within the body of the text only, excluding footnotes!
    Hope that helps!

  • Getting the count of DISTINCT of several columns

    How can i get the count of DISTINCT of several columns?
    SQL> select count(distinct ename) from emp;
    COUNT(DISTINCTENAME)
                      14
    SQL> select count(distinct ename, job) from emp;
    select count(distinct ename, job) from emp
    ERROR at line 1:
    ORA-00909: invalid number of arguments

    Hello,
    You should separate them out like this
    select count(distinct ename), count(distinct job) from emp;
    Regards

  • Get Record count and text of PreparedStatement

    Hi,
    Is it possible to get a count of the records returned in a ResultSet?
    Also, is it possible to get the text of a query(PreparedStatement). I am using a PreparedStatement, and want to know what the actual query looks like, with the parameters. i.e. Are the parameters being set correctly or not.
    Thanks,
    Dewang

    "Is it possible to get a count of the records returned in a ResultSet?"
    You will have to count them - I don't know of any other way.
    "I am using a PreparedStatement, and want to know what the actual query looks like, with the parameters. i.e. Are the parameters being set correctly or not."
    You will have to use debugging statements (possibly on both sides - in the Java and in your stored procedures if you are using them). You have to assume that your JDBC driver is managing the conversions properly, as long as you match the data types right. It's really pretty straightforward, just a little tedious.

  • I created an Execute SQL Task in ssis package to get the counts and store it in the variable and I use the variable in the control flow

    I always get the count as 0 for the variable. I am not sure what I am doing wrong. Please let me know if any of you know the fix.
    Thanks

    Here is the query I used now. Still same result
    SELECT (select  count(*) from usr_all_mbrs where PREMIER_YN = 'Y') AS count1

  • Need to get total count of a column in the given query

    Hi,
    I have the following query, for which I need a total count of distinct concatenated_address. I am trying to use count(distinct adv.concatenated_address) in the query below, but because of the GROUP BY it does not give me the expected result.
    I do not want to run the same query a second time in my program just to get the count, as that would hurt performance. This query takes really long to execute, so is there a way to incorporate the count in this single query itself without having to use it twice?
    SELECT DISTINCT (acv.customer_name||','||
    acv.customer_number||','||
    REPLACE(adv.concatenated_address, ',', ' ')||','||
    adv.postal_code||','||
    rct.interface_header_attribute1||','||
    rct.interface_header_attribute6||','||
    rct.creation_date||','||
    rct.trx_date||','||
    aps.due_date ||','||
    SUM(aps.amount_due_original)||','||
    SUM(aps.amount_due_remaining) ||','||
    rct.printing_count ||','||
    TO_DATE(rct.printing_last_printed)||','||
    TO_DATE(rct.printing_original_date)||',') str
    ,acv.customer_id
    ,REPLACE(adv.concatenated_address, ',', ' ') address
    FROM ar_customers_v acv
    ,ar_addresses_v adv
    ,hz_cust_site_uses hcsu
    ,ra_customer_trx rct
    ,ar_payment_schedules aps
    WHERE adv.customer_id = acv.customer_id
    AND hcsu.cust_acct_site_id = adv.address_id
    AND hcsu.site_use_code = 'BILL_TO'
    AND rct.bill_to_customer_id = acv.customer_id
    AND rct.bill_to_site_use_id = hcsu.site_use_id
    AND aps.customer_trx_id = rct.customer_trx_id
    GROUP BY acv.customer_name
    ,acv.customer_number
    ,adv.concatenated_address
    ,adv.postal_code
    ,rct.interface_header_attribute1
    ,rct.interface_header_attribute6
    ,rct.creation_date
    ,rct.trx_date
    ,aps.due_date
    ,rct.printing_count
    ,TO_DATE(rct.printing_last_printed)
    ,TO_DATE(rct.printing_original_date)
         ,acv.customer_id
    ORDER BY acv.customer_id
    ,REPLACE(adv.concatenated_address, ',', ' ')
    Thank you

    try this please
    SELECT COUNT(str),customer_id
    FROM
    (SELECT DISTINCT (acv.customer_name||','||
    acv.customer_number||','||
    REPLACE(adv.concatenated_address, ',', ' ')||','||
    adv.postal_code||','||
    rct.interface_header_attribute1||','||
    rct.interface_header_attribute6||','||
    rct.creation_date||','||
    RCT.TRX_DATE||','||
    aps.due_date ||','|| 
    SUM(aps.amount_due_original)||','||
    SUM(aps.amount_due_remaining) ||','||
    rct.printing_count ||','||
    TO_DATE(rct.printing_last_printed)||','||
    TO_DATE(rct.printing_original_date)||',') str
    ,acv.customer_id
    ,REPLACE(adv.concatenated_address, ',', ' ') address
    FROM ar_customers_v acv
    ,ar_addresses_v adv
    ,hz_cust_site_uses hcsu
    ,ra_customer_trx rct
    ,ar_payment_schedules aps
    WHERE adv.customer_id = acv.customer_id
    AND hcsu.cust_acct_site_id = adv.address_id
    AND hcsu.site_use_code = 'BILL_TO'
    AND rct.bill_to_customer_id = acv.customer_id
    AND rct.bill_to_site_use_id = hcsu.site_use_id
    AND aps.customer_trx_id = rct.customer_trx_id
    GROUP BY acv.customer_name
    ,acv.customer_number
    ,adv.concatenated_address
    ,adv.postal_code
    ,rct.interface_header_attribute1
    ,rct.interface_header_attribute6
    ,rct.creation_date
    ,rct.trx_date
    ,aps.due_date
    ,rct.printing_count
    ,TO_DATE(rct.printing_last_printed)
    ,TO_DATE(RCT.PRINTING_ORIGINAL_DATE)
    ,ACV.CUSTOMER_ID)
    GROUP BY customer_id
