Monday, November 24, 2014

Is India ready for eGovernance?

Since opening of Indian economy to the world, and proliferation of Internet all over, we are seeing more and more services provided by the government is becoming web centric. It means a lot, at least to the technology savvy young public, as they are able to get information literally on their finger tips by accessing very useful information on their smart phones and laptops.

But, as a old time Information Technology (IT) professional, I notice that many a times, implementation by government departments leaves much to be desired. Most of government websites are designed and maintained by National Informatics Centre (NIC).

I'll give some example to illustrate, what I mean.

1. Scant regard for IT Security

I will be able to best illustrate this point by showing screen-shots of few prominent government websites.

a. Error shown below is for a page on NSDL site that take you to a page that allows you to make TDS payments under Form 280, 281 etc. What it means that this page does not have valid SSL certificate. Certificate installed on this page is for other two sites.

The landing page has a valid SSL certificate but in a round about manner using SAN value.

Possibly this page has become obsolete over a period of time but it is still there on the NSDL website and still gives error shown below.


This 'Mismatched Address' SSL error on Aadhaar/UIDAI site is really not expected. The error is due to improper usage of a wildcard SSL certificate.


b. Error shown below is seen while doing 'Upload' of TDS return after logging in to eTDS/TCS website with proper credentials. It basically means that Jar file (Java executable file/application) is not digitally signed. If it was digitally signed using Code Signer Certificates for Java, possibly this error would not have appeared.


How does a user - who would be a accountant or tax consultant would react to such an error message. He would simply learn to 'ignore' all such errors and get his work done, somehow. But this tendency of ignoring errors could land him in great trouble if he is redirected to a phising site - which would steal all his confidential data and possibly install Trojan or a malware that would make his PC a part of a botnet.

Also, using Java applications for websites is considered very dangerous because of numerous vulnerabilities discovered over so many years. Here is an informative article on - whether one should uninstall Java.

2. Poor Design and coding of Web Applications
One has to find out by trial and error as to how to make certain sites work.

a. Website doesn't accept Amount with decimal points: Check Challan Status link of NSDL site (see below) does not like if you enter amount with decimal points! See screen-shot below, where if we enter amount with decimal points it gives error saying 'Please enter valid amount'.

Is it easy for any one to guess that he is not supposed to type in decimal point and two zeros after that?


In fact, few years back there was a site which required decimal point and two zeros to be compulsorily added to a number to make it work. If decimal and zeros were missing then system would throw an error.

b. Many sites don't accept valid Email ids: I have noticed many instances where a website or Excel Form will not accept certain valid Email id. For example, Form 280 (TDS) payment website does not accept an Email id if it has dash "-" in the Email id. Same issue is with Excel Form by Maharashtra VAT Departments for VAT returns. For example if Email id is abc@pqr-india.com it will be rejected, though prq-india.com could be a real registered domain.

Some sites while giving an error - unnecessarily resets values of certain fields without any reason. Below given example of Form-281 for TDS payments is a good example of such poorly designed system.

c. Aadhaar side does not accept valid 12 digit Aadhaar numbers: This error on Aadhaar site is craziest of all.



d. File Attachments sent by CPC, Bangalore (Income tax) are without Date and Time Stamp: Whenever I get any Email with attachments from CPC, Bangalore the file attachments are not having any date and time stamp. So, one can never be sure when such files were created by them in first place. Also, another problem is that while backing up such undated files backup software does not like and throws warnings because it works on date & time stamp to find out which is the latest file.

It is left to anybody's guess as to while should an Income Tax Assessment order which is digitally signed by the assessment officer should be undated.

e. Maharashtra VAT department's website and systems do not allow change of registered Email id, no matter what you do: We had registered vsnl.com Email id with Sales Tax (now VAT) department many years back. But, over a period of time we switched to a mail id using our own company domain. We changed registered mail id at most places except with Maharashtra VAT department.

We gave feedback on website saying our Email id has changed but nothing happened. We wrote grievance form after logging in to their website - nothing happened. We called helpdesk they could not help. We wrote letter and hand-delivered to the assessment officer - nothing happened. Once I had a chance to meet VAT Commissioner and I told him about it. His reaction was as if I was taking up a silly issue! And he did not take note of it for correction. May be that RTI may work.

But, result is that we don't get any circulars/ notices / information mails from VAT department. Is 'Ignorance a bliss'? Perhaps 'yes' but ultimately No.

More on this later.

Tuesday, May 27, 2014

Extracting E-Mail addresses from a file or a web page using Microsoft Excel

Recently I needed to extract E-mail addresses from a range of cells in an Excel file. I looked around for a solution on Internet.

After going through number of solutions, I hit upon a very elegant solution that could be used easily for extracting Email addresses not only from Excel file but from any web page or a word file or anything that can be selected and copied and pasted in to a Excel sheet. The solution is described below.

It's essentially a short VBA (Visual Basic for Applications) code (also called a program) to create a custom function in Excel named FindEmailAddresses(Cell1:Cell2). Where range Cell1:Cell2 would contain the text from where we wish to extract Email addresses.

Assuming we have already created the function (explained later below) - here is a description of how to use it.
1. Assuming cells A1 to C5 contains the text from where we want to extract email ids
2. Insert a new column at A1 - so that original content of A1 to C5 will shift to B1 to D5.
3. Now, select Cells A1 to A4 - where the extracted Email ids would come. Here, we are selecting only 4 rows because we know that number of extracted Email ids will be less than 4. If they are more then select more number of rows.
4. While keeping the selection, in Cell A1 type the function =FindEmailAddresses(B1:D5), as shown below.
5. Now press Ctrl + Shift + Enter. Note that this step is crucial. By doing this, formula in A1 will get copied to all cells up to A4.



A
B
C
D
1
=FindEmailAddresses(B1:C5)
Hello world
Well just a empty cell
Nothing here
2
=FindEmailAddresses(B1:C5)
There is no email here
There is an email id:prakash@gmail.com
Here also nothing
3
=FindEmailAddresses(B1:C5)
Excel is very good tool
Practically this is useless

4
=FindEmailAddresses(B1:C5)
Her email is meena@yahoo.com but it changed
Nothing here
Well this Is good
5

Nothing

Theoretically this is empty

You will the extracted Email ids from B1:D5 in A1:A4 as shown below.
 
A
B
C
D
1
prakash@gmail.com
Hello world
Well just a empty cell
Nothing here
2
meena@yahoo.com
There is no email here
There is an email id:prakash@gmail.com
Here also nothing
3
#NA
Excel is very good tool
Practically this is useless

4
#NA
Her email is meena@yahoo.com but it changed
Nothing here
Well this Is good
5

Nothing

Theoretically this is empty

If all cells A1:A4 shows #VALUE! then save the file and upon opening the file you will see a warning saying 'Macros are disabled' Click on 'Enable Macro' and you will see the Emails ids in A1 to A4. If number of Email ids are less than 4 then you will see #NA in remaining cells. If they are more then select A1:A4 and press 'Del' key. Now, select more number of cells in step-3 above and continue. 
 
Note that if the range B1:D5 has only one Email id then all cells A1:A4 gets filled up with same Email id.

How do I extract Email addresses from my Gmail or Outlook application?
Well, this technique can be used only for extracting Email addresses from a single page. If you wish to extract Email addresses from your Email application - you will have to get programs that are developed specifically to do this work.

How to add user defined function FindEmailAddresses
Before you can use above function, you will have to add the user defined function using following procedure in Excel. It works with all versions of Excel from 2003 to 2013.

  1. Press Alt-F11 to open Visual Basic editor (while you are in the file containing the Email addresses)
  2. On top menu under Insert click on Module.
  3. Copy and paste the user defined function (given below) into new module window.
  4. Exit Visual Basic editor by clicking on 'x'
  5. Use the FindEmailAddresses function in Excel file, as explained above.

Visual Basic for Applications (VBA) Code:
Copy and Paste following code (program) in Visual Basic editor of Excel.
-----------------------------------------
Function FindEmailAddresses(rng As Range) As Variant()
Dim Temp As String, Cell As Range, EM() As Variant
ReDim EM(0)
For Each Cell In rng
Temp = Cell.Value
Do While InStr(Temp, "@")
   EM(UBound(EM)) = GetEmailAddress(Temp)
   Temp = Replace(Temp, "@", "", 1, 1)
   ReDim Preserve EM(UBound(EM) + 1)
   Loop
Next
ReDim Preserve EM(UBound(EM) - 1)
FindEmailAddresses = WorksheetFunction.Transpose(EM)
End Function

Function GetEmailAddress(ByVal S As String) As String
Dim x As Long, AtSign As Long
Dim Locale As String, DomainPart As String
Locale = "[A-Za-z0-9.!#$%&'*/=?^_`{|}~+-]"
Domain = "[A-Za-z0-9._-]"
AtSign = InStr(S, "@")
For x = AtSign To 1 Step -1
    If Not Mid(" " & S, x, 1) Like Locale Then
       S = Mid(S, x)
    If Left(S, 1) = "." Then S = Mid(S, 2)
       Exit For
    End If
Next x
AtSign = InStr(S, "@")
For x = AtSign + 1 To Len(S) + 1
    If Not Mid(S & " ", x, 1) Like Domain Then
       S = Left(S, x - 1)
    If Right(S, 1) = "." Then S = Left(S, Len(S) - 1)
       GetEmailAddress = S
       Exit For
    End If
Next x
End Function
----------------------------------------
Explanatory Notes
This notes are for those who wants to know more about technique used in this solution and further improvements.
1.  VBA code uses simple string comparison to find out presence of @ within the content of a cell. If it found one it found out word before and after it and extracted it as a Email address. This is done in GetEmailAddress function
2. It uses a clever Array Formula technique to store the extracted Email addresses one each in each cell. This is done by Ctrl+Shift+Enter key combination.
3. With some modifications this technique can be used to sort the extracted Email address domain wise (@yourcompany.com).

If you find it useful or have any suggestions - do write back.

Covid Resource for Second Wave

Initial part of this blog has resources useful for all of India. Later part is dedicated to resources for Mumbai city.