<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Opening a file in Python</title>
	<atom:link href="http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/feed/" rel="self" type="application/rss+xml" />
	<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/</link>
	<description>Wherein I write some stuff that you may like to read. Or not, it's up to you really.</description>
	<pubDate>Fri, 29 Aug 2008 09:14:39 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
		<item>
		<title>By: Steve Kryskalla</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32785</link>
		<dc:creator>Steve Kryskalla</dc:creator>
		<pubDate>Fri, 09 May 2008 17:16:05 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32785</guid>
		<description>I think you could also turn this type of function into a decorator that will automatically coerce an argument of the wrapped function.</description>
		<content:encoded><![CDATA[<p>I think you could also turn this type of function into a decorator that will automatically coerce an argument of the wrapped function.</p>
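<p>A minimal sketch of that decorator idea, assuming Python 3 (where str stands in for basestring). The names accepts_file and count_lines are hypothetical, and only the first positional argument is coerced:</p>

```python
import functools
import io


def accepts_file(func):
    """Hypothetical decorator: coerce the first argument to a file-like object."""
    @functools.wraps(func)
    def wrapper(file_name_or_object, *args, **kwargs):
        if isinstance(file_name_or_object, str):
            # A path was passed: open it here and close it when the call returns.
            with open(file_name_or_object) as handle:
                return func(handle, *args, **kwargs)
        # Anything else is assumed to be file-like already and is passed through.
        return func(file_name_or_object, *args, **kwargs)
    return wrapper


@accepts_file
def count_lines(file_object):
    # The wrapped function only ever sees an iterable of lines.
    return sum(1 for _ in file_object)


print(count_lines(io.StringIO("first\nsecond\n")))  # prints 2
```

<p>Because the wrapper closes any file it opened itself, this only suits wrapped functions that finish reading before they return.</p>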
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Kryskalla</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32784</link>
		<dc:creator>Steve Kryskalla</dc:creator>
		<pubDate>Fri, 09 May 2008 17:12:16 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32784</guid>
		<description>Another version:

import os

def getfile(obj):
....if all(hasattr(obj, attr) for attr in ['read', 'seek', 'write']):
........return obj
....elif isinstance(obj, basestring) and os.path.isfile(obj):
........return open(obj)
....else:
........raise ValueError("Not a file-like object or valid path.")</description>
		<content:encoded><![CDATA[<p>Another version:</p>
<p>import os</p>
<p>def getfile(obj):<br />
....if all(hasattr(obj, attr) for attr in ['read', 'seek', 'write']):<br />
........return obj<br />
....elif isinstance(obj, basestring) and os.path.isfile(obj):<br />
........return open(obj)<br />
....else:<br />
........raise ValueError("Not a file-like object or valid path.")</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Snazz</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32779</link>
		<dc:creator>Snazz</dc:creator>
		<pubDate>Fri, 09 May 2008 14:43:17 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32779</guid>
		<description>You could use named arguments

def my_function(file_name=None, file_object=None):
....if file_object is None:
........file_object = open(file_name, 'r')
....# Do stuff with file_object

my_function(file_name='test.txt')
my_function(file_object=open('test.txt', 'r'))</description>
		<content:encoded><![CDATA[<p>You could use named arguments</p>
<p>def my_function(file_name=None, file_object=None):<br />
....if file_object is None:<br />
........file_object = open(file_name, 'r')<br />
....# Do stuff with file_object</p>
<p>my_function(file_name='test.txt')<br />
my_function(file_object=open('test.txt', 'r'))</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Kirby</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32777</link>
		<dc:creator>Dave Kirby</dc:creator>
		<pubDate>Fri, 09 May 2008 14:09:59 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32777</guid>
		<description>If the code is only going to read the data in the file a line at a time then I would say it is far better to specify that the function takes an iterable sequence of strings, so that the client code is responsible for creating the file object and passing it in.  This way the client could equally give the function a urllib object, or a StringIO object, or a list of strings, or a generator, or anything else that the user can come up with.

This not only makes the function more flexible and useful, it also makes it much easier to test.  I have lots of tests where the production version of the code takes a file object, but the unit tests pass in something like this:

testdata = '''
lines 
of test
data
'''.splitlines()

result = myFunction(testdata)
assert result == expected


This makes the test self-contained, instead of being split across lots of supplementary files.
 
Having a function that takes either a file object or a filename is a code smell IMHO - if you must allow either then have two separate functions: the first takes a file (or string iterable), and the second takes a filename, opens the file and calls the first function.

e.g.
def doStuff(fileObj):
    pass  # do stuff with the file

def openFileAndDoStuff(filename):
    doStuff(open(filename))</description>
		<content:encoded><![CDATA[<p>If the code is only going to read the data in the file a line at a time then I would say it is far better to specify that the function takes an iterable sequence of strings, so that the client code is responsible for creating the file object and passing it in.  This way the client could equally give the function a urllib object, or a StringIO object, or a list of strings, or a generator, or anything else that the user can come up with.</p>
<p>This not only makes the function more flexible and useful, it also makes it much easier to test.  I have lots of tests where the production version of the code takes a file object, but the unit tests pass in something like this:</p>
<p>testdata = '''<br />
lines<br />
of test<br />
data<br />
'''.splitlines()</p>
<p>result = myFunction(testdata)<br />
assert result == expected</p>
<p>This makes the test self-contained, instead of being split across lots of supplementary files.</p>
<p>Having a function that takes either a file object or a filename is a code smell IMHO - if you must allow either then have two separate functions: the first takes a file (or string iterable), and the second takes a filename, opens the file and calls the first function.</p>
<p>e.g.<br />
def doStuff(fileObj):<br />
    pass  # do stuff with the file</p>
<p>def openFileAndDoStuff(filename):<br />
    doStuff(open(filename))</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paddy3118</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32775</link>
		<dc:creator>Paddy3118</dc:creator>
		<pubDate>Fri, 09 May 2008 13:55:09 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32775</guid>
		<description>Isn't the try/except to be preferred over isinstance/hasattr? Or will it lead to bad logic?
The OP's case will assume it's a file object if it is not open-able; as opposed to assuming it is a file object if it is not an instance of basestring.

I suspect that in this case it amounts to the same thing if open itself only works if given an instance of basestring, but style-wise I thought it was preferable to limit the use of isinstance/hasattr?

Hell, I'm nit-picking. Either would work!

- Paddy.</description>
		<content:encoded><![CDATA[<p>Isn&#8217;t the try/except to be preferred over isinstance/hasattr? Or will it lead to bad logic?<br />
The OP&#8217;s case will assume it&#8217;s a file object if it is not open-able; as opposed to assuming it is a file object if it is not an instance of basestring.</p>
<p>I suspect that in this case it amounts to the same thing if open itself only works if given an instance of basestring, but style-wise I thought it was preferable to limit the use of isinstance/hasattr?</p>
<p>Hell, I&#8217;m nit-picking. Either would work!</p>
<p>- Paddy.</p>
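<p>For comparison, a rough sketch of the try/except style, assuming Python 3, where open() raises TypeError for an argument that is neither path-like nor an integer file descriptor. The name get_file is hypothetical:</p>

```python
import io


def get_file(file_name_or_object):
    """Hypothetical helper: try to open the argument, else use it as-is."""
    try:
        return open(file_name_or_object)
    except TypeError:
        # open() rejected the argument's type, so assume that the caller
        # passed a file-like object and hand it back unchanged.
        return file_name_or_object


buffer = io.StringIO("some data")
print(get_file(buffer) is buffer)  # prints True
```

<p>A path that does not exist still raises FileNotFoundError, so a bad filename is not silently mistaken for a file object.</p>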
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Foord</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32773</link>
		<dc:creator>Michael Foord</dc:creator>
		<pubDate>Fri, 09 May 2008 12:02:32 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32773</guid>
		<description>I agree that in this circumstance a type check ( isinstance(file_name_or_object, basestring) ) is perfectly acceptable. ConfigObj does this. (Particularly as there is no 'string protocol' and so any filename will have to be passed in as a real string or at least a subclass.)

Michael</description>
		<content:encoded><![CDATA[<p>I agree that in this circumstance a type check ( isinstance(file_name_or_object, basestring) ) is perfectly acceptable. ConfigObj does this. (Particularly as there is no &#8216;string protocol&#8217; and so any filename will have to be passed in as a real string or at least a subclass.)</p>
<p>Michael</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: vArDo</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32772</link>
		<dc:creator>vArDo</dc:creator>
		<pubDate>Fri, 09 May 2008 12:02:16 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32772</guid>
		<description>@me: why do you use 'hasattr' with 'elif'? Why not simply use 'isinstance(file_name_or_object,file)'? Is it faster? I'm asking because many classes can have a 'read' method without necessarily being 'file-like'.</description>
		<content:encoded><![CDATA[<p>@me: why do you use &#8216;hasattr&#8217; with &#8216;elif&#8217;? Why not simply use &#8216;isinstance(file_name_or_object,file)&#8217;? Is it faster? I&#8217;m asking because many classes can have a &#8216;read&#8217; method without necessarily being &#8216;file-like&#8217;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steven</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32771</link>
		<dc:creator>Steven</dc:creator>
		<pubDate>Fri, 09 May 2008 11:51:55 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32771</guid>
		<description>What about this? It would depend, of course, on how "filelike" the object has to be. I don't know whether it is considered the most idiomatic, but it's used for example in xml.etree.ElementTree for the parse function.

if not hasattr(file_name_or_object, 'read'):
    file_name_or_object = open(file_name_or_object)
return file_name_or_object</description>
		<content:encoded><![CDATA[<p>What about this? It would depend, of course, on how &#8220;filelike&#8221; the object has to be. I don&#8217;t know whether it is considered the most idiomatic, but it&#8217;s used for example in xml.etree.ElementTree for the parse function.</p>
<p>if not hasattr(file_name_or_object, 'read'):<br />
    file_name_or_object = open(file_name_or_object)<br />
return file_name_or_object</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: me</title>
		<link>http://halfcooked.com/blog/2008/05/09/opening-a-file-in-python/#comment-32770</link>
		<dc:creator>me</dc:creator>
		<pubDate>Fri, 09 May 2008 11:37:53 +0000</pubDate>
		<guid isPermaLink="false">http://halfcooked.com/blog/?p=64#comment-32770</guid>
		<description>def my_function(file_name_or_object):
    # parameter is string
    if isinstance(file_name_or_object, (str, unicode)):
        return open(file_name_or_object)
        
    # parameter is a file-like object that has at least a read method
    elif hasattr(file_name_or_object, "read"):
        return file_name_or_object</description>
		<content:encoded><![CDATA[<p>def my_function(file_name_or_object):<br />
    # parameter is string<br />
    if isinstance(file_name_or_object, (str, unicode)):<br />
        return open(file_name_or_object)</p>
<p>    # parameter is a file-like object that has at least a read method<br />
    elif hasattr(file_name_or_object, "read"):<br />
        return file_name_or_object</p>
]]></content:encoded>
	</item>
</channel>
</rss>
